echemdbconverters.baseloader
TODO:: Include module imports in each doctest.
TODO:: Reword
Loader for CSV files (https://datatracker.ietf.org/doc/html/rfc4180), which consist of a single header line containing the column (field) names, followed by rows of comma-separated values.
In pandas, the names of the columns are referred to as column_names, whereas in a frictionless datapackage they are called fields. The datapackage also contains further information, such as the type of the data, a title and a set of descriptors.
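To illustrate the difference in naming, the following minimal sketch builds a frictionless-style schema from a list of column names. The column names and the uniform type "number" are assumptions chosen for illustration. The layout follows the Table Schema convention of a "fields" list whose entries carry a "name" and a "type"; it is not taken from this module.

```python
# Hypothetical column names, as they would appear in a pandas DataFrame.
column_names = ["t", "E"]

# In a frictionless schema each column becomes a field descriptor.
# The type "number" is an assumption; real data may need other types.
schema = {"fields": [{"name": name, "type": "number"} for name in column_names]}

field_names = [field["name"] for field in schema["fields"]]
print(field_names)  # ['t', 'E']
```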
The CSV object has the following properties:
TODO:: Add examples for the following functions
- a DataFrame
- the column names
- the header contents
- the number of header lines
Loaders for non-standard CSV files can be called:
TODO:: Add example
- class echemdbconverters.baseloader.BaseLoader(file, header_lines=None, column_header_lines=None, decimal=None, delimiters=None)
Loads a CSV file whose first line contains the column (field) names and whose remaining lines contain comma-separated values.
EXAMPLES:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.df
   a  b
0  0  0
1  1  1
A list of column names:
>>> csv.column_header_names
['a', 'b']
TODO: Link to device list in the documentation.
More specific loaders can be selected:
>>> from io import StringIO
>>> file = StringIO('''EC-Lab ASCII FILE
... Nb header lines : 6
...
... Device metadata : some metadata
...
... mode\ttime/s\tEwe/V\t<I>/mA\tcontrol/V
... 2\t0\t0.1\t0\t0
... 2\t1\t1.4\t5\t1
... ''')
>>> from echemdbconverters.baseloader import BaseLoader
>>> csv = BaseLoader.create('eclab')(file)
>>> csv.df
   mode  time/s  Ewe/V  <I>/mA  control/V
0     2       0    0.1       0          0
1     2       1    1.4       5          1
- property column_header_lines
The number of lines containing the descriptive information of the data for each column.
EXAMPLES:
A file with a single column header line:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.column_header_lines
1
A file with two column header lines:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... x,y
... 0,0
... 1,1''')
>>> csv = BaseLoader(file, column_header_lines=2)
>>> csv.column_header_lines
2
- property column_header_names
A list of column header names constructed from the lines containing the column header names.
EXAMPLES:
A file with a single column header line:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.column_header_names
['a', 'b']
For a file containing two or more column header lines, a single name is created for each column by joining the entries of all column header lines with ' / ':
>>> from io import StringIO
>>> file = StringIO(r'''T,v
... K,m/s
... 0,0
... 1,1''')
>>> csv = BaseLoader(file, column_header_lines=2)
>>> csv.column_header_names
['T / K', 'v / m/s']
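The joining of several column header lines can be sketched with the standard library alone. The function name combine_header_lines is hypothetical and this is only one possible way to obtain the result shown above, not necessarily how BaseLoader does it.

```python
import csv
from io import StringIO


def combine_header_lines(header_lines, delimiter=","):
    """Join the entries of several column header lines into one name per column.

    Hypothetical helper for illustration only.
    """
    rows = list(csv.reader(header_lines, delimiter=delimiter))
    # zip(*rows) transposes the header rows into one tuple per column.
    return [" / ".join(parts) for parts in zip(*rows)]


headers = StringIO("T,v\nK,m/s\n")
names = combine_header_lines(headers.readlines())
print(names)  # ['T / K', 'v / m/s']
```

With a single header line the names are returned unchanged, since each column tuple then has only one entry.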
- property column_headers
The lines in the file containing the descriptive information of the data for each column.
EXAMPLES:
A file with a single column header line:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.column_headers.readlines()
['a,b\n']
A file with two column header lines, where the second line is used, for example, to store the units of the values:
>>> from io import StringIO
>>> file = StringIO(r'''T,v
... K,m/s
... 0,0
... 1,1''')
>>> csv = BaseLoader(file, column_header_lines=2)
>>> csv.column_headers.readlines()
['T,v\n', 'K,m/s\n']
- static create(device=None)
Selects a specific loader based on a given device. The returned loader is then called with the file.
EXAMPLES:
>>> from io import StringIO
>>> file = StringIO('''EC-Lab ASCII FILE
... Nb header lines : 6
...
... Device metadata : some metadata
...
... mode\ttime/s\tEwe/V\t<I>/mA\tcontrol/V
... 2\t0\t0.1\t0\t0
... 2\t1\t1.4\t5\t1
... ''')
>>> csv = BaseLoader.create('eclab')(file)
>>> csv.df
   mode  time/s  Ewe/V  <I>/mA  control/V
0     2       0    0.1       0          0
1     2       1    1.4       5          1
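The two-step call create('eclab')(file) suggests a factory that returns a loader class, which is then instantiated with the file. A minimal sketch of such a pattern follows. The classes SketchLoader and SketchECLabLoader and the lookup table are hypothetical stand-ins and do not reproduce the module's actual implementation.

```python
from io import StringIO


class SketchLoader:
    """Hypothetical stand-in for a base loader; not the library's class."""

    def __init__(self, file):
        self.file = file

    @staticmethod
    def create(device=None):
        # Map device names to loader classes; unknown names raise KeyError.
        loaders = {None: SketchLoader, "eclab": SketchECLabLoader}
        return loaders[device]


class SketchECLabLoader(SketchLoader):
    """Hypothetical device-specific subclass."""


# create() returns a class, which is then called with the file.
loader_class = SketchLoader.create("eclab")
loader = loader_class(StringIO("a,b\n0,0\n"))
print(type(loader).__name__)  # SketchECLabLoader
```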
- property data
A file-like object with the data of the CSV without header lines.
EXAMPLES:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> type(csv.data)
<class '_io.StringIO'>

>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.data.readlines()
['0,0\n', '1,1']
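How a file might be divided into header, column headers and data can be sketched as follows. The function split_csv and its defaults are assumptions made for illustration; the sketch simply slices the lines by the two line counts and wraps each part in a StringIO.

```python
from io import StringIO


def split_csv(file, header_lines=0, column_header_lines=1):
    """Split a file-like object into header, column headers and data.

    Hypothetical helper for illustration only.
    """
    file.seek(0)
    lines = file.readlines()
    start = header_lines + column_header_lines
    header = StringIO("".join(lines[:header_lines]))
    column_headers = StringIO("".join(lines[header_lines:start]))
    data = StringIO("".join(lines[start:]))
    return header, column_headers, data


header, column_headers, data = split_csv(StringIO("a,b\n0,0\n1,1"))
data_rows = data.readlines()
print(data_rows)  # ['0,0\n', '1,1']
```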
- property decimal
The decimal separator in the floats in the CSV data.
EXAMPLES:
A standard CSV containing floats with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a,b
... 0.0,0.0
... 1.0,1.0''')
>>> csv = BaseLoader(file)
>>> csv.decimal
'.'
For a CSV containing only integers, '.' is returned:
>>> from io import StringIO
>>> file = StringIO('''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.decimal
'.'
A standard CSV containing integers with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.decimal
'.'
A standard CSV containing integers and floats with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a,b
... 0,0.0
... 1,1.0''')
>>> csv = BaseLoader(file)
>>> csv.decimal
'.'
A TSV containing floats with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a\tb
... 0\t0.0
... 1\t1.0''')
>>> csv = BaseLoader(file)
>>> csv.decimal
'.'
A TSV containing integers and floats using , as decimal separator with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a,b
... 0\t0,0
... 1\t1,0''')
>>> csv = BaseLoader(file)
>>> # csv.data.readlines()[1].strip().split(csv.delimiter)
>>> csv.decimal
','
Data rows containing both ‘.’ and ‘,’, where ‘,’ appears only in text values:
>>> from io import StringIO
>>> file = StringIO('''a\tb\ttext
... 0.0\t0.0\ta,b
... 1.0\t1.0\tc,d''')
>>> csv = BaseLoader(file)
>>> csv.decimal
'.'
Data rows containing both ‘.’ and ‘,’ in the numeric values:
>>> from io import StringIO
>>> file = StringIO('''a\tb\ttext
... 0.1\t0,0\ta,b
... 1.1\t1,0\tc,d''')
>>> csv = BaseLoader(file)
>>> csv.decimal
Traceback (most recent call last):
...
ValueError: Decimal separator could not be determined. Found both ',' and '.' in numeric values in a single data line.
Implementation in a specific device loader:
>>> from io import StringIO
>>> file = StringIO('''EC-Lab ASCII FILE
... Nb header lines : 6
...
... Device metadata : some metadata
...
... mode\ttime/s\tEwe/V\t<I>/mA\tcontrol/V
... 2\t0\t0,1\t0\t0
... 2\t1\t1,4\t5\t1
... ''')
>>> csv = BaseLoader.create('eclab')(file)
>>> csv.decimal
','
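The behaviour shown in the examples above could be approximated by inspecting the numeric fields of a single data line. The following is a minimal sketch under stated assumptions: the function guess_decimal is hypothetical, it only recognises plain numbers such as 1.5 or 1,5 (no exponents or thousands separators), and it falls back to '.' when no float is found.

```python
import re


def guess_decimal(data_line, delimiter):
    """Guess the decimal separator from the numeric fields of one data line.

    Hypothetical helper for illustration only.
    """
    found = set()
    for field in data_line.strip().split(delimiter):
        # Only fields that look like a number with one separator count.
        match = re.fullmatch(r"[+-]?\d+([.,])\d+", field)
        if match:
            found.add(match.group(1))
    if len(found) > 1:
        raise ValueError("Found both ',' and '.' in numeric values.")
    # Fall back to '.' when the line holds only integers or text.
    return found.pop() if found else "."


print(guess_decimal("0\t0,0", "\t"))  # ,
```

Text fields such as a,b are ignored because they do not match the numeric pattern, which mirrors the case where ‘,’ appears only in a text column.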
- property delimiter
The delimiter in the CSV, which is extracted from the first two lines of the CSV data.
A CSV containing integers:
>>> from io import StringIO
>>> file = StringIO('''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.delimiter
','
A CSV containing floats:
>>> from io import StringIO
>>> file = StringIO('''a,b
... 0.0,0.0
... 1.0,1.0''')
>>> csv = BaseLoader(file)
>>> csv.delimiter
','
A TSV containing floats with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a\tb
... 0\t0.0
... 1\t1.0''')
>>> csv = BaseLoader(file)
>>> csv.delimiter
'\t'
A TSV with three columns containing floats using , as decimal separator with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a\tb\tc
... 0,0\t0,0\t0,0
... 1,1\t1,0\t0,0''')
>>> csv = BaseLoader(file)
>>> csv.delimiter
'\t'
A TSV with two columns containing floats using , as decimal separator with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a\tb
... 0,0\t0,0
... 1,1\t1,0''')
>>> csv = BaseLoader(file)
>>> csv.delimiter
'\t'
A TSV containing integers and floats using , as decimal separator with a single header line:
>>> from io import StringIO
>>> file = StringIO('''a\tb
... 0\t0,0
... 1\t1,0
... ''')
>>> csv = BaseLoader(file)
>>> csv.delimiter
'\t'
A rather messy file:
>>> from io import StringIO
>>> file = StringIO(('''# I am messy data
... Random stuff
... maybe metadata : 3
... in different formats = abc123
... hopefully, some information
... on where the data block starts!
... t\tE\tj
... s\tV\tA/cm2
... 0\t0\t0
... 1\t1\t1
... 2\t2\t2
... '''))
>>> csv = BaseLoader(file)
>>> csv.delimiter
'\t'
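A simple way to guess a delimiter from two lines is to look for a candidate character that occurs equally often in both. The sketch below is an assumption about one possible heuristic, not the module's algorithm. The function guess_delimiter and the candidate order (tab and semicolon before comma, so that decimal commas are not mistaken for delimiters) are hypothetical.

```python
def guess_delimiter(lines, candidates=("\t", ";", ",")):
    """Guess the delimiter from the first two lines of a CSV.

    Hypothetical helper for illustration only; expects at least two lines.
    """
    first, second = (line.rstrip("\n") for line in lines[:2])
    for candidate in candidates:
        count = first.count(candidate)
        # A delimiter should appear in both lines the same number of times.
        if count > 0 and count == second.count(candidate):
            return candidate
    raise ValueError("No consistent delimiter found.")


print(repr(guess_delimiter(["a\tb\n", "0,0\t0,0\n"])))  # '\t'
```

This sketch would not cope with the messy file above, where the data block first has to be located; that step is left out here.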
- property df
A pandas dataframe of the data in the CSV.
EXAMPLES:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.df
   a  b
0  0  0
1  1  1
- property file
A file-like object of the loaded file.
EXAMPLES:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> type(csv.file)
<class '_io.StringIO'>
- property header
The header of the CSV (excluding column names).
EXAMPLES:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> type(csv.header)
<class '_io.StringIO'>

>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.header.readlines()
[]
- property header_lines
The number of header lines in a CSV excluding the line with the column names.
EXAMPLES:
Files for the base loader do not have a header:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.header_lines
0
Implementation in a specific device loader:
>>> file = StringIO('''EC-Lab ASCII FILE
... Nb header lines : 6
...
... Device metadata : some metadata
...
... mode\ttime/s\tEwe/V\t<I>/mA\tcontrol/V
... 2\t0\t0,1\t0\t0
... 2\t1\t1,4\t5\t1
... ''')
>>> csv = BaseLoader.create('eclab')(file)
>>> csv.header_lines
5
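In the example above the file declares 6 header lines while header_lines is 5, which is consistent with the declared count including the line with the column names. Under that assumption, a device loader could derive the value as sketched below. The function eclab_header_lines and the regular expression are hypothetical and are not taken from the module.

```python
import re
from io import StringIO


def eclab_header_lines(file, column_header_lines=1):
    """Read the declared header length and subtract the column header lines.

    Hypothetical helper; assumes the declared count includes the column names.
    """
    file.seek(0)
    for line in file:
        match = re.match(r"Nb header lines\s*:\s*(\d+)", line)
        if match:
            return int(match.group(1)) - column_header_lines
    raise ValueError("No 'Nb header lines' entry found.")


file = StringIO(
    "EC-Lab ASCII FILE\n"
    "Nb header lines : 6\n"
    "\n"
    "Device metadata : some metadata\n"
    "\n"
    "mode\ttime/s\n"
    "2\t0\n"
)
n_header = eclab_header_lines(file)
print(n_header)  # 5
```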
- property metadata
A dict containing the metadata of the file found in its header.
EXAMPLES:
>>> from io import StringIO
>>> file = StringIO(r'''a,b
... 0,0
... 1,1''')
>>> csv = BaseLoader(file)
>>> csv.metadata
Traceback (most recent call last):
...
NotImplementedError