Load and Save
unitpackage Entries and Collections can be loaded from different sources and saved as datapackages (CSV and JSON) in a specified output directory.
Load collections
From local files
A local collection can be created by recursively collecting all datapackages stored in a specific folder of the file system.
from unitpackage.collection import Collection
db = Collection.from_local("../files")
db
[Entry('demo_package'), Entry('demo_package_cv'), Entry('demo_package_metadata')]
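To illustrate what collecting recursively means, the following minimal sketch walks a folder tree with the standard library and lists the JSON files it finds at any depth. It is not unitpackage's implementation: the helper name find_datapackages is hypothetical, and treating every *.json file as a datapackage descriptor is a simplifying assumption.

```python
import tempfile
from pathlib import Path


def find_datapackages(root):
    """Return the sorted base names of all JSON files below root (hypothetical helper)."""
    return sorted(path.stem for path in Path(root).rglob("*.json"))


# Build a small throwaway folder tree to search.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "sub" / "deeper"
    nested.mkdir(parents=True)
    (Path(tmp) / "demo_package.json").write_text("{}")
    (nested / "demo_package_cv.json").write_text("{}")
    (nested / "notes.txt").write_text("not a datapackage")

    # Files in nested folders are found as well; non-JSON files are ignored.
    found = find_datapackages(tmp)

print(found)
```

In this sketch, files in nested subfolders end up in the same flat list as files in the top-level folder, which mirrors how a collection presents its entries.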
From URL
A collection can also be created by recursively collecting datapackages from a ZIP file located at a URL. The data is extracted to a temporary directory.
Note
If the argument url is not passed to the from_remote method, the data shown on echemdb.org is downloaded from the
echemdb data repository and stored as a collection instead.
from unitpackage.collection import Collection
db = Collection.from_remote()
db
[Entry('alves_2011_electrochemistry_6010_f1a_solid'), Entry('alves_2011_electrochemistry_6010_f2_red'), Entry('atkin_2009_afm_13266_f4a_solid'), Entry('atkin_2009_afm_13266_f4b_solid'), Entry('berger_2017_lithium_261_f1_black'), Entry('berger_2017_lithium_261_f1_inset'), Entry('berger_2017_lithium_261_f1_pink'), Entry('briega-martos_2021_cation_48_f1cs_black'), Entry('briega-martos_2021_cation_48_f1cs_blue'), Entry('briega-martos_2021_cation_48_f1cs_green'), Entry('briega-martos_2021_cation_48_f1cs_pink'), Entry('briega-martos_2021_cation_48_f1cs_red'), Entry('briega-martos_2021_cation_48_f1li_black'), Entry('briega-martos_2021_cation_48_f1li_blue'), Entry('briega-martos_2021_cation_48_f1li_green'), Entry('briega-martos_2021_cation_48_f1li_pink'), Entry('briega-martos_2021_cation_48_f1li_red'), Entry('briega-martos_2021_cation_48_f1na_black'), Entry('briega-martos_2021_cation_48_f1na_blue'), Entry('briega-martos_2021_cation_48_f1na_green'), Entry('briega-martos_2021_cation_48_f1na_pink'), Entry('briega-martos_2021_cation_48_f1na_red'), Entry('briega_martos_2018_understanding_j3045_f1a_black'), Entry('briega_martos_2018_understanding_j3045_f1a_blue'), Entry('briega_martos_2018_understanding_j3045_f1a_green'), Entry('briega_martos_2018_understanding_j3045_f1a_red'), Entry('briega_martos_2018_understanding_j3045_f1b_black'), Entry('briega_martos_2018_understanding_j3045_f1b_blue'), Entry('briega_martos_2018_understanding_j3045_f1b_green'), Entry('briega_martos_2018_understanding_j3045_f1b_red'), Entry('briega_martos_2018_understanding_j3045_f1c_black'), Entry('briega_martos_2018_understanding_j3045_f1c_blue'), Entry('briega_martos_2018_understanding_j3045_f1c_green'), Entry('briega_martos_2018_understanding_j3045_f1c_red'), Entry('briega_martos_2018_understanding_j3045_f1d_black'), Entry('briega_martos_2018_understanding_j3045_f1d_blue'), Entry('briega_martos_2018_understanding_j3045_f1d_green'), Entry('briega_martos_2018_understanding_j3045_f1d_red'), 
Entry('briega_martos_2018_understanding_j3045_f1e_black'), Entry('briega_martos_2018_understanding_j3045_f1e_blue'), Entry('briega_martos_2018_understanding_j3045_f1e_green'), Entry('briega_martos_2018_understanding_j3045_f1e_red'), Entry('briega_martos_2018_understanding_j3045_f1f_black'), Entry('briega_martos_2018_understanding_j3045_f1f_blue'), Entry('briega_martos_2018_understanding_j3045_f1f_green'), Entry('briega_martos_2018_understanding_j3045_f1f_red'), Entry('briega_martos_2018_understanding_j3045_f1g_black'), Entry('briega_martos_2018_understanding_j3045_f1g_blue'), Entry('briega_martos_2018_understanding_j3045_f1g_green'), Entry('briega_martos_2018_understanding_j3045_f1g_red'), Entry('briega_martos_2018_understanding_j3045_f1h_black'), Entry('briega_martos_2018_understanding_j3045_f1h_blue'), Entry('briega_martos_2018_understanding_j3045_f1h_green'), Entry('briega_martos_2018_understanding_j3045_f1h_red'), Entry('clavilier_1980_preparation_205_f2_solid'), Entry('clavilier_1990_insitu_1_f1_solid'), Entry('clavilier_1990_insitu_1_f5_10-10-9'), Entry('clavilier_1990_insitu_1_f5_110'), Entry('clavilier_1990_insitu_1_f5_554'), Entry('clavilier_1990_insitu_1_f5_775'), Entry('domke_2003_determination_113_f2_dash-dot'), Entry('domke_2003_determination_113_f2_dash-dot-dot'), Entry('domke_2003_determination_113_f2_dashed'), Entry('domke_2003_determination_113_f2_dotted'), Entry('domke_2003_determination_113_f2_solid'), Entry('domke_2003_determination_113_f3b_dotted'), Entry('durand_1992_insitu_1977_f1a_solid'), Entry('endo_1999_in-situ_19_f1_a'), Entry('endo_1999_in-situ_19_f1_b'), Entry('engstfeld_2018_polycrystalline_17743_f4b_1'), Entry('garcia_2011_enthalpic_501_f1a_black'), Entry('garcia_2011_enthalpic_501_f1a_green'), Entry('garcia_2011_enthalpic_501_f1a_red'), Entry('gasparotto_2009_in_11140_f1_dashed'), Entry('gomez-marin_2012_surface_558_f1_black'), Entry('gomez-marin_2012_surface_558_f1_blue'), Entry('gomez-marin_2012_surface_558_f1_red'), 
Entry('gomez-marin_2012_surface_558_f2_blue'), Entry('gomez-marin_2012_surface_558_f2_pink'), Entry('gomez-marin_2012_surface_558_f2_red'), Entry('gomez_1993_electrochemical_189_f1_solid'), Entry('gomez_2003_effect_228_f1_dotted'), Entry('gomez_2003_effect_228_f2_dotted'), Entry('gomez_2003_effect_228_f3_dotted'), Entry('hamad_2003_electrosorption_211_f1a_dotted'), Entry('hamad_2003_electrosorption_211_f1a_solid'), Entry('horswell_2004_a_10970_f1a_dashed'), Entry('horswell_2004_a_10970_f1a_dotted'), Entry('horswell_2004_a_10970_f1a_solid'), Entry('jerkiewicz_2009_effect_12309_f1a_solid'), Entry('jovic_1999_cyclic_247_f1_dashed'), Entry('jovic_1999_cyclic_247_f1_solid'), Entry('jovic_1999_cyclic_247_f4_solid'), Entry('jovic_1999_cyclic_247_f6_dashed'), Entry('jovic_1999_cyclic_247_f6_solid'), Entry('jovic_1999_cyclic_247_f7_dashed'), Entry('jovic_1999_cyclic_247_f7_solid'), Entry('kerner_2002_measurement_2055_f2a_solid'), Entry('kerner_2002_measurement_2055_f3a_thick'), Entry('kerner_2002_measurement_2055_f3a_thin'), Entry('kerner_2002_measurement_2055_f4a_thick'), Entry('kerner_2002_measurement_2055_f4a_thin'), Entry('kerner_2002_measurement_2055_f5a_thick'), Entry('kerner_2002_measurement_2055_f5a_thin'), Entry('kerner_2002_measurement_2055_f6a_1'), Entry('kerner_2002_measurement_2055_f6a_2'), Entry('kibler_2000_in-situ_73_f4a_black'), Entry('kibler_2000_in-situ_73_f5a_black'), Entry('kibler_2000_in-situ_73_f7a_black'), Entry('kibler_2000_in-situ_73_f8a_black'), Entry('li_2000_chronocoulometric_95_f1_dash-dotted'), Entry('li_2000_chronocoulometric_95_f1_dashed'), Entry('li_2000_chronocoulometric_95_f1_dotted'), Entry('li_2000_chronocoulometric_95_f1_solid'), Entry('lipkowski_1998_ionic_2875_f1a_br'), Entry('lipkowski_1998_ionic_2875_f1a_cl'), Entry('lipkowski_1998_ionic_2875_f1a_i'), Entry('lipkowski_1998_ionic_2875_f1a_so4'), Entry('mello_2018_bromide_18562_f1a_black'), Entry('mello_2018_bromide_18562_f1a_red'), Entry('mello_2018_bromide_18562_f1b_black'), 
Entry('mello_2018_bromide_18562_f1b_red'), Entry('mello_2018_bromide_18562_f1c_black'), Entry('mello_2018_bromide_18562_f1c_red'), Entry('mello_2018_bromide_18562_f1d_black'), Entry('mello_2018_bromide_18562_f1d_red'), Entry('mello_2018_bromide_18562_f1e_black'), Entry('mello_2018_bromide_18562_f1e_red'), Entry('mello_2018_bromide_18562_f1f_black'), Entry('mello_2018_bromide_18562_f1f_red'), Entry('mello_2018_bromide_18562_f1g_black'), Entry('mello_2018_bromide_18562_f1g_red'), Entry('mello_2018_bromide_18562_f1h_black'), Entry('mello_2018_bromide_18562_f1h_red'), Entry('mello_2018_bromide_18562_f1i_black'), Entry('mello_2018_bromide_18562_f1i_red'), Entry('nakamura_2011_structure_165433_f1a_blue'), Entry('nakamura_2011_structure_165433_f1a_green'), Entry('nakamura_2011_structure_165433_f1a_red'), Entry('nakamura_2011_structure_165433_f1b_blue'), Entry('nakamura_2011_structure_165433_f1b_red'), Entry('nakamura_2014_structural_22136_f1a_inset'), Entry('nishihara_1994_underpotential_75_f1a_solid'), Entry('nishihara_1994_underpotential_75_f1b_solid'), Entry('nishihara_1994_underpotential_75_f2a_solid'), Entry('nishihara_1994_underpotential_75_f2b_solid'), Entry('nishihara_1994_underpotential_75_f3a_solid'), Entry('nishihara_1994_underpotential_75_f3b_solid'), Entry('nishihara_1994_underpotential_75_f4a_solid'), Entry('nishihara_1994_underpotential_75_f4b_solid'), Entry('nishihara_1994_underpotential_75_f5a_solid'), Entry('nishihara_1994_underpotential_75_f5b_solid'), Entry('nishihara_1994_underpotential_75_f6a_solid'), Entry('nishihara_1994_underpotential_75_f6b_solid'), Entry('nishihara_1994_underpotential_75_f7a_solid'), Entry('nishihara_1994_underpotential_75_f7b_solid'), Entry('ocko_1997_halide_55_f6a_solid'), Entry('pajkossy_1996_impedance_209_f1a_dotted'), Entry('pajkossy_1996_impedance_209_f1a_solid'), Entry('pajkossy_1996_impedance_209_f1b_dotted'), Entry('pajkossy_1996_impedance_209_f1b_solid'), Entry('pajkossy_1996_impedance_209_f3a_a'), 
Entry('pajkossy_1996_impedance_209_f3a_b'), Entry('pajkossy_1996_impedance_209_f3a_dashed'), Entry('pajkossy_1996_impedance_209_f8a_a'), Entry('pajkossy_1996_impedance_209_f8a_b'), Entry('pajkossy_1996_impedance_209_f8a_dashed'), Entry('pajkossy_2001_double_3063_f2_inset'), Entry('pajkossy_2001_double_3063_f5a_solid'), Entry('pajkossy_2001_double_3063_f6a_solid'), Entry('rehim_1998_electrochemical_1103_f1_solid'), Entry('rudnev_2020_structural_501_f2c_1'), Entry('sandbeck_2019_dissolution_2997_f1a_solid_red'), Entry('sandbeck_2019_dissolution_2997_f1b_solid_blue'), Entry('sandbeck_2019_dissolution_2997_f1c_solid_green'), Entry('sato_2006_effect_725_f4a_red'), Entry('sato_2006_effect_725_f4b_red'), Entry('sato_2006_effect_725_f4c_red'), Entry('sato_2006_effect_725_f4d_red'), Entry('schnaidt_2017_a_4141_f2_solid'), Entry('schuett_2021_electrodeposition_20461_fs1_blue'), Entry('shi_1996_chloride_225_f1_dotted'), Entry('shi_1996_chloride_225_f1_solid'), Entry('shi_1996_chloride_225_f1a_dashed'), Entry('shi_1996_chloride_225_f1a_solid'), Entry('taguchi_2007_electrochemical_6023_f2a_solid'), Entry('taguchi_2007_electrochemical_6023_f2b_solid'), Entry('taguchi_2007_electrochemical_6023_f2c_solid'), Entry('taguchi_2007_electrochemical_6023_f2d_solid'), Entry('taguchi_2007_electrochemical_6023_f2e_solid'), Entry('taguchi_2007_electrochemical_6023_f2f_solid'), Entry('taguchi_2007_electrochemical_6023_f2g_solid'), Entry('taguchi_2007_electrochemical_6023_f2h_solid'), Entry('taguchi_2007_electrochemical_6023_f2i_solid'), Entry('wandlowski_1996_structural_10277_f1a_dashed'), Entry('wandlowski_1996_structural_10277_f1a_solid'), Entry('wang_1996_ordered_6672_f1_solid'), Entry('wang_1996_ordered_6672_f2a_solid'), Entry('wang_1996_ordered_6672_f2b_solid'), Entry('wang_1997_lateral_1_f1a_solid'), Entry('wen_2015_potential-dependent_6062_f1_black'), Entry('wen_2015_potential-dependent_6062_f1_blueinset'), Entry('wen_2015_potential-dependent_6062_fs1_black'), 
Entry('zei_1991_the_295_f3a_solid'), Entry('zei_1991_the_295_f3b_solid')]
The parameter outdir specifies the directory in which the extracted packages are saved.
The parameter data specifies the folder within the ZIP that contains the datapackages.
from unitpackage.collection import Collection
db = Collection.from_remote(data='data', outdir='generated/from_url')
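The extract-then-collect idea can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the code used by from_remote: the ZIP is built in memory instead of being downloaded, the archive layout (repo/data/...) is made up, and selecting files by a folder named "data" only mimics the role of the data parameter.

```python
import io
import tempfile
import zipfile
from pathlib import Path

# Create an in-memory ZIP so the example needs no network access.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("repo/data/a/entry_a.json", "{}")
    archive.writestr("repo/data/b/entry_b.json", "{}")
    archive.writestr("repo/docs/readme.json", "{}")

with tempfile.TemporaryDirectory() as outdir:
    # Extract everything, as one would after downloading the archive.
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(outdir)

    # Keep only JSON files located below a folder named "data".
    names = sorted(
        path.stem
        for path in Path(outdir).rglob("*.json")
        if "data" in path.relative_to(outdir).parts
    )

print(names)
```

Because a temporary directory is removed when the with block ends, anything that should persist has to be written to a permanent output directory instead.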
Load entries
An individual entry can be loaded from a local datapackage.
from unitpackage.entry import Entry
entry = Entry.from_local("../files/demo_package.json")
entry
Entry('demo_package')
Alternatively, entries can be created from CSV files or pandas DataFrames.
Metadata and field descriptors can be added after creation using update_fields() and metadata.from_dict().
From a CSV file:
from unitpackage.entry import Entry
csv_entry = Entry.from_csv(csvname="../files/demo_package.csv")
csv_entry
Entry('demo_package')
For CSV files with more complex structures, additional arguments can be provided:
- header_lines: number of header lines to skip before the data
- column_header_lines: number of lines containing column headers (multiple lines are flattened and separated by /)
- decimal: decimal separator (e.g., ',' for European-style numbers)
- delimiters: column delimiter (auto-detected if not specified)
- encoding: file encoding
For example, a CSV with multiple header lines:
csv_entry = Entry.from_csv(csvname='../../examples/from_csv/from_csv_multiple_headers.csv', column_header_lines=2)
csv_entry.fields
[{'name': 'E / V', 'type': 'integer'},
{'name': 'j / A / cm2', 'type': 'integer'}]
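How several column header lines collapse into single field names can be sketched with the csv module. This is a rough illustration, not the parser behind Entry.from_csv: the function read_flattened and the sample text are invented for this example, and only the roles of header_lines and column_header_lines are mimicked.

```python
import csv
import io


def read_flattened(text, header_lines=0, column_header_lines=1, delimiter=","):
    """Skip leading lines, then join stacked column headers with ' / ' (illustrative only)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    rows = rows[header_lines:]
    header_rows = rows[:column_header_lines]
    # zip(*...) groups the header cells column by column.
    names = [" / ".join(cell.strip() for cell in cells) for cells in zip(*header_rows)]
    data = rows[column_header_lines:]
    return names, data


# Made-up sample: one comment line, two column header lines, two data rows.
sample = "# exported by a hypothetical instrument\nE,j\nV,A / cm2\n0,0\n1,2\n"
names, data = read_flattened(sample, header_lines=1, column_header_lines=2)
print(names)
```

With this sample, the two stacked header cells of each column are joined into one name, analogous to the field names 'E / V' and 'j / A / cm2' shown above.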
For even more complex file formats from laboratory equipment, see the Loaders section.
From specific device file formats
Files from laboratory equipment (devices) often have complex structures with lengthy headers, non-standard delimiters, and instrument-specific column names.
Entry.from_csv supports a device parameter that selects the appropriate loader for the file format.
For example, loading a BioLogic EC-Lab MPT file:
from unitpackage.entry import Entry
entry = Entry.from_csv(csvname='../../test/loader_data/eclab_cv.mpt', device='eclab')
entry
Entry('eclab_cv')
The loader automatically detects headers and delimiters. The resulting entry contains the raw column names from the instrument:
entry.fields
[{'name': 'mode', 'type': 'integer'},
{'name': 'ox/red', 'type': 'integer'},
{'name': 'error', 'type': 'integer'},
{'name': 'control changes', 'type': 'integer'},
{'name': 'counter inc.', 'type': 'integer'},
{'name': 'time/s', 'type': 'number'},
{'name': 'control/V', 'type': 'number'},
{'name': 'Ewe/V', 'type': 'number'},
{'name': '<I>/mA', 'type': 'number'},
{'name': 'cycle number', 'type': 'number'},
{'name': '(Q-Qo)/C', 'type': 'number'},
{'name': 'I Range', 'type': 'integer'},
{'name': 'P/W', 'type': 'number'}]
Information on the file structure is stored in the entry’s metadata under dsvDescription:
entry.metadata['dsvDescription']['loader']
'ECLabLoader'
Domain-specific loading
For submodules such as echemdb, convenience methods provide additional processing.
EchemdbEntry.from_mpt loads an MPT file and then updates the fields with units, renames them to short standardized names, and keeps only the most relevant columns for electrochemistry:
from unitpackage.database.echemdb_entry import EchemdbEntry
entry = EchemdbEntry.from_mpt('../../test/loader_data/eclab_cv.mpt')
entry.df.head()
Fields with names ['Ri/Ohm', '-Im(Z)/Ohm', '-Im(Zce)/Ohm', '-Im(Zwe-ce)/Ohm', '(Q-Qo)/mA.h', '<Ece>/V', '<Ewe>/V', '|Ece|/V', '|Energy|/W.h', '|Ewe|/V', '|I|/A', '|Y|/Ohm-1', '|Z|/Ohm', '|Zce|/Ohm', '|Zwe-ce|/Ohm', 'Analog IN 1/V', 'Analog IN 2/V', 'Analog IN 3/V', 'Capacitance charge/µF', 'Capacitance discharge/µF', 'Capacity/mA.h', 'charge time/s', 'Conductivity/S.cm-1', 'control/mA', 'control/V/mA', 'Cp-2/µF-2', 'Cp/µF', 'Cs-2/µF-2', 'Cs/µF', 'cycle time/s', 'd(Q-Qo)/dE/mA.h/V', 'dI/dt/mA/s', 'discharge time/s', 'dQ/C', 'dq/mA.h', 'dQ/mA.h', 'Ece/V', 'Ecell/V', 'Efficiency/%', 'Energy charge/W.h', 'Energy discharge/W.h', 'Energy/W.h', 'Ewe-Ece/V', 'freq/Hz', 'half cycle', 'I/mA', 'Im(Y)/Ohm-1', 'Ns changes', 'Ns', 'NSD Ewe/%', 'NSD I/%', 'NSR Ewe/%', 'NSR I/%', 'Phase(Y)/deg', 'Phase(Z)/deg', 'Phase(Zce)/deg', 'Phase(Zwe-ce)/deg', 'Q charge/discharge/mA.h', 'Q charge/mA.h', 'Q charge/mA.h/g', 'Q discharge/mA.h', 'Q discharge/mA.h/g', 'R/Ohm', 'Rcmp/Ohm', 'Re(Y)/Ohm-1', 'Re(Z)/Ohm', 'Re(Zce)/Ohm', 'Re(Zwe-ce)/Ohm', 'step time/s', 'THD Ewe/%', 'THD I/%', 'x', 'z cycle'] were provided but do not appear in the field names of tabular resource ['mode', 'ox/red', 'error', 'control changes', 'counter inc.', 'time/s', 'control/V', 'Ewe/V', '<I>/mA', 'cycle number', '(Q-Qo)/C', 'I Range', 'P/W'].
| | t | E | I | cycle |
|---|---|---|---|---|
| 0 | 86.761598 | 0.849737 | 0.001722 | 1.0 |
| 1 | 86.772598 | 0.849149 | -0.003851 | 1.0 |
| 2 | 86.792598 | 0.848137 | -0.004390 | 1.0 |
| 3 | 86.812598 | 0.847135 | -0.004741 | 1.0 |
| 4 | 86.832598 | 0.846144 | -0.004914 | 1.0 |
The fields now have units, short names, and a reference to the original BioLogic column name:
entry.fields
[{'name': 't',
'type': 'number',
'description': 'Time.',
'unit': 's',
'dimension': 't',
'originalName': 'time/s'},
{'name': 'E',
'type': 'number',
'description': 'WE potential versus REF.',
'unit': 'V',
'dimension': 'E',
'originalName': 'Ewe/V'},
{'name': 'I',
'type': 'number',
'description': 'Average current over the potential step (calculated from I = '
'dQ/dt).',
'unit': 'mA',
'dimension': 'I',
'originalName': '<I>/mA'},
{'name': 'cycle',
'type': 'number',
'description': 'Cycle number.',
'originalName': 'cycle number'}]
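The renaming step can be pictured as a lookup from instrument column names to short names with units. The sketch below uses plain dictionaries and is only an assumption about the general idea, not the code in EchemdbEntry.from_mpt; the lookup table known is hypothetical and covers just three columns.

```python
# Raw field descriptors as they might come from an instrument file.
raw_fields = [
    {"name": "mode", "type": "integer"},
    {"name": "time/s", "type": "number"},
    {"name": "Ewe/V", "type": "number"},
    {"name": "<I>/mA", "type": "number"},
]

# Hypothetical lookup table: original column name -> short name and unit.
known = {
    "time/s": {"name": "t", "unit": "s"},
    "Ewe/V": {"name": "E", "unit": "V"},
    "<I>/mA": {"name": "I", "unit": "mA"},
}

# Keep only known columns, rename them, and remember the original name.
fields = [
    {**field, **known[field["name"]], "originalName": field["name"]}
    for field in raw_fields
    if field["name"] in known
]

for field in fields:
    print(field)
```

Storing originalName alongside the new name keeps the link back to the instrument file, so a renamed column can still be traced to its source.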
From a pandas DataFrame:
import pandas as pd
from unitpackage.entry import Entry
data = {'x': [1,2,3], 'v': [1,3,2]}
df = pd.DataFrame(data)
df_entry = Entry.from_df(df, basename='df_data')
df_entry
Entry('df_data')
For more details on adding metadata and field descriptions, see Creating Unitpackages.
Save entries
Entries can be saved as JSON and CSV in a specified folder either directly from a collection
from unitpackage.collection import Collection
db = Collection.from_local("../files")
db.save_entries(outdir="../generated/files")
or from a single entry.
from unitpackage.entry import Entry
entry = Entry.from_local("../files/demo_package.json")
entry.save(outdir="../generated/files/saved_entry")
The basename of the entry can be modified.
entry.save(basename=entry.identifier + "_r", outdir="../generated/files/saved_entry")
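As a rough picture of what saving under a basename produces, the sketch below writes a CSV file and a JSON descriptor that share one base name. It is a simplified stand-in, not the behavior of entry.save: the helper save_pair is hypothetical, and the descriptor layout is reduced to a few keys rather than following the full datapackage structure.

```python
import csv
import json
import tempfile
from pathlib import Path


def save_pair(rows, fields, basename, outdir):
    """Write <basename>.csv and <basename>.json into outdir (hypothetical helper)."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    names = [field["name"] for field in fields]
    with open(outdir / f"{basename}.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(names)
        writer.writerows(rows)
    # Simplified descriptor; a real datapackage descriptor has more structure.
    descriptor = {"name": basename, "path": f"{basename}.csv", "fields": fields}
    (outdir / f"{basename}.json").write_text(json.dumps(descriptor, indent=2))


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "saved_entry"
    fields = [{"name": "x", "type": "integer"}, {"name": "v", "type": "integer"}]
    save_pair([[1, 1], [2, 3], [3, 2]], fields, "df_data_r", target)
    written = sorted(path.name for path in target.iterdir())
    descriptor = json.loads((target / "df_data_r.json").read_text())

print(written)
```

Changing the basename in this sketch renames both files together, so the descriptor and its data file stay paired.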