Welcome to echemdb-converters’s documentation!
echemdbconverters
provides a modular API for loading non-standardized DSV (delimiter-separated values) or CSV (comma-separated values) files, commonly created by software used to operate laboratory equipment.
Typical issues with these files are lengthy header lines containing various metadata relevant to the recording software,
the use of ,
as a decimal separator in some regions of the world,
and files containing multiple data tables.
echemdbconverters
provides a means to load data directly as a pandas
DataFrame and allows conversion of data via a CLI into frictionless Data Packages (or unitpackages, which support the use of units) for seamless integration into existing workflows.
Our approach aims at providing a single interface for loading data into a common format, independent of the data source.
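The decimal-separator issue mentioned above can be illustrated with plain pandas, independent of echemdbconverters. The following is a minimal sketch; the semicolon-delimited sample data is made up for illustration and is not taken from any supported file format.

```python
from io import StringIO

import pandas as pd

# Hypothetical export using ';' as delimiter and ',' as decimal separator.
raw = StringIO("t;E\n0,0;0,10\n0,5;0,15\n1,0;0,20\n")

# Without decimal="," pandas would keep these values as strings.
df = pd.read_csv(raw, sep=";", decimal=",")

print(df.dtypes)
print(df)
```

Passing the decimal separator explicitly is the simplest approach when the file origin is known; a loader that targets many sources has to detect or accept it as a parameter instead.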
File types supported and tested by echemdbconverters
are:
Manufacturer | Device type | Software | File suffix | Loader | device |
---|---|---|---|---|---|
Biologic | Potentiostat | EClab | mpt | EClabLoader | eclab |
Gamry | Potentiostat | Gamry Instruments Framework | DTA | GamryLoader | gamry |
Todo
Improve table, such as including links.
Examples
Consider the following DSV. It consists of three parts:
the header, which usually contains metadata relevant to the software and user-defined settings,
column header lines containing acronyms (dimensions) and often units for the data in one or more rows,
the data block, where each column consists of identical data types.
from io import StringIO
file = StringIO('''# I am messy data
Random stuff
maybe metadata : 3
in different formats = abc123
hopefully, some information
on where the data block starts!
t\tE\tj
s\tV\tA/cm2
0\t0\t0
1\t1\t1
2\t2\t2
''')
from echemdbconverters.baseloader import BaseLoader
csv = BaseLoader(file, header_lines=6, column_header_lines=2)
file.seek(0)
print(csv.file.read())
# I am messy data
Random stuff
maybe metadata : 3
in different formats = abc123
hopefully, some information
on where the data block starts!
t E j
s V A/cm2
0 0 0
1 1 1
2 2 2
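The three-part structure described above can be pictured with plain Python slicing. This is only a sketch of the idea under the assumption that the number of header lines and column header lines is known; it is not how BaseLoader is necessarily implemented.

```python
text = (
    "# I am messy data\n"
    "Random stuff\n"
    "maybe metadata : 3\n"
    "in different formats = abc123\n"
    "hopefully, some information\n"
    "on where the data block starts!\n"
    "t\tE\tj\n"
    "s\tV\tA/cm2\n"
    "0\t0\t0\n"
    "1\t1\t1\n"
    "2\t2\t2\n"
)

# Same counts as passed to BaseLoader in the example above.
header_lines = 6
column_header_lines = 2

lines = text.splitlines(keepends=True)

# Slice the file into its three parts by line position.
header = lines[:header_lines]
column_headers = lines[header_lines:header_lines + column_header_lines]
data = lines[header_lines + column_header_lines:]

print("".join(column_headers))
print("".join(data))
```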
A pandas DataFrame can be created with limited input data.
The delimiter of the data block is detected using the clevercsv
module (unless specified explicitly).
Column headers spanning multiple lines are flattened into a single line.
from echemdbconverters.baseloader import BaseLoader
csv = BaseLoader(file, header_lines=6,
column_header_lines=2,
delimiters=None,
decimal=None)
csv.df
| | t / s | E / V | j / A/cm2 |
---|---|---|---|
0 | 0 | 0 | 0 |
1 | 1 | 1 | 1 |
2 | 2 | 2 | 2 |
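The flattening of multiple column header lines mentioned above can be sketched in a few lines of plain Python. The " / " separator is inferred from the column names in the table above; the actual implementation in echemdbconverters may differ.

```python
# The two column header rows of the example file (tab separated).
column_header_lines = ["t\tE\tj", "s\tV\tA/cm2"]

# Split each header row into its cells.
rows = [line.split("\t") for line in column_header_lines]

# Join the cells of each column across all header rows with " / ".
flat_names = [" / ".join(parts) for parts in zip(*rows)]

print(flat_names)
# ['t / s', 'E / V', 'j / A/cm2']
```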
All parts of the file are accessible from the API for further use, for example to extract metadata from the header.
print(csv.header.read())
# I am messy data
Random stuff
maybe metadata : 3
in different formats = abc123
hopefully, some information
on where the data block starts!
print(csv.column_headers.read())
t E j
s V A/cm2
print(csv.data.read())
0 0 0
1 1 1
2 2 2
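Once the header text is available, metadata extraction could look roughly like the sketch below. The rule used here (a line of the form "key : value" or "key = value" is metadata) is a hypothetical convention chosen for this example file; real instrument headers usually need a format-specific parser.

```python
import re

# Header text as returned by csv.header.read() in the example above.
header = """# I am messy data
Random stuff
maybe metadata : 3
in different formats = abc123
hopefully, some information
on where the data block starts!
"""

# Hypothetical rule: "key : value" or "key = value" marks a metadata line.
pattern = re.compile(r"^\s*(?P<key>[^:=]+?)\s*[:=]\s*(?P<value>.+?)\s*$")

metadata = {}
for line in header.splitlines():
    match = pattern.match(line)
    if match:
        metadata[match["key"]] = match["value"]

print(metadata)
# {'maybe metadata': '3', 'in different formats': 'abc123'}
```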
The data can also be converted into frictionless Data Packages using the CLI.
Note
The input and output files of the following commands can be found in the test folder of the repository.
The CLI only works for standard CSV files without a header and with a single column header line, and for the specific converters summarized above.
A “standard” CSV
!echemdbconverters csv ../test/data/default.csv --outdir ../test/generated
A specific file type, including additional YAML metadata.
!echemdbconverters csv ../test/data/eclab_cv.mpt --device eclab --metadata ../test/data/eclab_cv.mpt.metadata --outdir ../test/generated
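To give a rough idea of what a frictionless Data Package is, the sketch below builds a descriptor by hand, following the general layout of the frictionless specification (a list of resources, each with a path and a schema of fields). It is not the output of the CLI: the field names are hypothetical, and the "unit" key on each field is an assumption about how units might be attached.

```python
import json

# Hand-written illustration, not generated by echemdbconverters.
# Field names and the "unit" key are assumptions made for this sketch.
descriptor = {
    "resources": [
        {
            "name": "default",
            "path": "default.csv",
            "format": "csv",
            "schema": {
                "fields": [
                    {"name": "t", "type": "number", "unit": "s"},
                    {"name": "E", "type": "number", "unit": "V"},
                    {"name": "j", "type": "number", "unit": "A / cm2"},
                ]
            },
        }
    ]
}

# A Data Package stores this descriptor as JSON next to the CSV file.
print(json.dumps(descriptor, indent=2))
```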
Further usage
Use echemdb’s unitpackage
to browse, modify, and visualize the Data Packages.
from unitpackage.collection import Collection
db = Collection.from_local('../test/generated')
entry = db['eclab_cv']
entry
Entry('eclab_cv')
Installation
This package is available on PyPI and can be installed with pip:
pip install echemdbconverters
See the installation instructions for further details.
License
The contents of this repository are licensed under the GNU General Public License v3.0 or, at your option, any later version.