Usage
Command-line interface
The mdstools CLI provides commands for converting between YAML and tabular formats.
Flatten YAML to Excel/CSV
Convert a nested YAML metadata file to a tabular format:
mdstools flatten tests/example_metadata.yaml --output-dir generated
Or via pixi:
pixi run flatten tests/example_metadata.yaml --output-dir generated
Options:
--output-dir: Output directory (default: generated/)
--schema: Schema name for enrichment (adds descriptions and examples)
--format: Output format (xlsx, csv, or md)
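To make the idea of flattening concrete, here is a minimal Python sketch of turning a nested structure into one row per leaf value. It is an illustration only: the function name flatten_nested, the dotted path format, and the zero-based list indices are assumptions made for this example, and the rows that mdstools actually writes may be laid out differently.

```python
def flatten_nested(node, prefix=()):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict/list.

    Hypothetical helper for illustration; not part of mdstools.
    Empty dicts and lists produce no rows in this sketch.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten_nested(child, prefix + (str(key),))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten_nested(child, prefix + (str(index),))
    else:
        # A scalar leaf: emit its full path and value as one row.
        yield ".".join(prefix), node


data = {"curation": {"process": [{"role": "curator", "name": "Jane Doe"}]}}
rows = list(flatten_nested(data))

for path, value in rows:
    print(f"{path}\t{value}")
```

Each row carries the full path to its value, which is what makes the table reversible.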
Unflatten Excel/CSV to YAML
Convert a tabular file back to nested YAML:
mdstools unflatten generated/output.xlsx
Or via pixi:
pixi run unflatten generated/output.xlsx
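The reverse direction can be sketched in the same spirit. The Python snippet below rebuilds a nested structure from (path, value) rows; the helper name unflatten_rows and the dotted path convention are assumptions for this example, not the mdstools implementation. It also assumes rows arrive in order, with list indices starting at 0 and no gaps.

```python
def unflatten_rows(rows):
    """Rebuild a nested dict/list structure from (dotted_path, value) rows.

    Hypothetical helper for illustration; not part of mdstools.
    """
    root = {}
    for path, value in rows:
        parts = path.split(".")
        node = root
        # Walk (and create) every container on the way to the leaf.
        for part, next_part in zip(parts, parts[1:]):
            # A numeric next segment means the child container is a list.
            default = [] if next_part.isdigit() else {}
            if isinstance(node, list):
                index = int(part)
                if index == len(node):
                    node.append(default)
                node = node[index]
            else:
                node = node.setdefault(part, default)
        if isinstance(node, list):
            node.append(value)  # relies on rows being in index order
        else:
            node[parts[-1]] = value
    return root


rows = [
    ("curation.process.0.role", "curator"),
    ("curation.process.0.name", "Jane Doe"),
]
nested = unflatten_rows(rows)
print(nested)
```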
Update metadata to a newer schema version
Migrate a metadata file (or data package) from an older schema version to a
newer one. Without --in-place, this is a dry run that only reports the
migration steps for each document:
mdstools update path/to/metadata.yaml
Apply the migration, rewriting the file in place (YAML comments and layout are preserved):
mdstools update path/to/metadata.yaml --in-place
Or via pixi:
pixi run update path/to/metadata.yaml --in-place
Options:
--to-version: Target schema version (default: the installed version).
--in-place: Write the migrated file back in place. Without it, update only reports what would change.
Data packages are handled per resource: each resources[].metadata.<key> block
is migrated using its own echemdbSchemaVersion.
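As a rough Python illustration of what "per resource" means, the sketch below walks a data package dict and lists the schema version recorded in each metadata block. The helper name iter_versioned_blocks, the "echemdb" key, the resource names, and the version strings are all made up for the example, and it assumes echemdbSchemaVersion is a top-level field of each block.

```python
def iter_versioned_blocks(package):
    """Yield (resource_name, key, version) for each resources[].metadata.<key> block.

    Hypothetical helper for illustration; not part of mdstools.
    """
    for resource in package.get("resources", []):
        for key, block in resource.get("metadata", {}).items():
            # Assumption: the version is stored directly on the block.
            version = block.get("echemdbSchemaVersion") if isinstance(block, dict) else None
            yield resource.get("name"), key, version


# Illustrative package; names and version numbers are placeholders.
package = {
    "resources": [
        {"name": "a", "metadata": {"echemdb": {"echemdbSchemaVersion": "0.1.0"}}},
        {"name": "b", "metadata": {"echemdb": {"echemdbSchemaVersion": "0.2.0"}}},
    ]
}

blocks = list(iter_versioned_blocks(package))
for name, key, version in blocks:
    print(f"{name}: metadata.{key} is at schema version {version}")
```

Because each block carries its own version, two resources in the same package can start from different versions and still be migrated to the same target.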
Python API
Basic flattening
from mdstools.metadata.metadata import Metadata
data = {"curation": {"process": [{"role": "curator", "name": "Jane Doe"}]}}
metadata = Metadata(data)
flattened = metadata.flatten()
flattened.to_excel("output.xlsx")
With schema enrichment
from mdstools.metadata.metadata import Metadata
from mdstools.metadata.enriched_metadata import EnrichedFlattenedMetadata
# `data` is the nested metadata dict defined in the basic flattening example
metadata = Metadata(data)
flattened = metadata.flatten()
enriched = EnrichedFlattenedMetadata(flattened.rows, schema_dir="schemas")
enriched.to_excel("output.xlsx")
Multi-sheet Excel export
enriched.to_excel("output.xlsx", separate_sheets=True)
Schema validation
from mdstools.schema.validator import validate_with_json_schema, validate_with_pydantic
# JSON Schema validation
validate_with_json_schema(data, schema_name="minimum_echemdb")
# Pydantic validation
validate_with_pydantic(data, schema_name="minimum_echemdb")
Migrating metadata across schema versions
from mdstools.schema.migrate import MetadataMigrator, migrate_file
# In memory: migrate a dict, then validate the result against a target schema
migrated = MetadataMigrator(data, target_version="latest").migrated()
MetadataMigrator(data).validate("minimum_echemdb")
# From a file (returns the migrated dict; pass in_place=True to overwrite,
# preserving YAML comments)
migrated = migrate_file("metadata.yaml", in_place=True)
Only breaking schema changes need a migration step; additive changes are
backward-compatible. Steps are declared in mdstools/schema/migrations.py.
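The distinction between breaking and additive changes can be sketched as follows. This Python example is not the contents of mdstools/schema/migrations.py: the version list, the BREAKING_STEPS registry, the migrate function, and the author/name rename are all hypothetical and serve only to show how a migration can skip versions that have no step.

```python
# Illustrative version history; the numbers are placeholders.
VERSIONS = ["0.1.0", "0.2.0", "0.3.0"]


def rename_author_to_name(doc):
    """Hypothetical breaking change in 0.3.0: 'author' was renamed to 'name'."""
    doc = dict(doc)  # shallow copy so the caller's dict is left untouched
    if "author" in doc:
        doc["name"] = doc.pop("author")
    return doc


# Only versions with a breaking change have an entry.
# 0.2.0 is assumed to be purely additive, so it needs no step.
BREAKING_STEPS = {"0.3.0": rename_author_to_name}


def migrate(doc, source, target):
    """Apply every registered step between source (exclusive) and target (inclusive)."""
    start = VERSIONS.index(source)
    end = VERSIONS.index(target)
    applied = []
    for version in VERSIONS[start + 1 : end + 1]:
        step = BREAKING_STEPS.get(version)
        if step is not None:
            doc = step(doc)
            applied.append(version)
    return doc, applied


migrated, applied = migrate({"author": "Jane Doe"}, "0.1.0", "0.3.0")
print(migrated, applied)
```

Migrating from 0.1.0 to 0.2.0 in this sketch returns the document unchanged, since no step is registered for the additive version.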