Create Dataset
mixtrain dataset create <name> <file>
Create a dataset from a file.
Supported formats: .parquet, .csv, .tsv
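Because only these three extensions are listed as supported, a script can check the file name before calling the command. The following is a minimal sketch, not part of mixtrain: the helper name `create_dataset_checked` is hypothetical, and the case-sensitive extension match is an assumption, since the text does not say how the CLI treats names such as `DATA.CSV`.

```shell
# Hypothetical helper (not part of mixtrain): reject unsupported file
# extensions before calling `mixtrain dataset create`.
# Assumption: extensions are matched case-sensitively.
create_dataset_checked() {
  name=$1
  file=$2
  case "$file" in
    *.parquet|*.csv|*.tsv)
      # Extension is one of the documented formats; hand off to the CLI.
      mixtrain dataset create "$name" "$file"
      ;;
    *)
      echo "unsupported format: $file (expected .parquet, .csv, or .tsv)" >&2
      return 1
      ;;
  esac
}
```

Calling `create_dataset_checked training-data data.parquet` would run the same command as the first example in this section, while a file such as `data.json` is rejected before the CLI is invoked.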
Options:
| Option | Description |
|---|---|
| --description, -d | Dataset description |
mixtrain dataset create training-data data.parquet
mixtrain dataset create eval-set data.csv --description "Evaluation data"

List Datasets
mixtrain dataset list
List all datasets in the workspace.
mixtrain dataset list
Output:
| Name | Rows | Created |
|---------------|---------|------------|
| training-data | 10,000 | 2024-01-15 |
| eval-set | 500 | 2024-01-16 |

Query Dataset
mixtrain dataset query <name> [query]
Interactive TUI browser with search capabilities.
Controls:
Ctrl+F - Search
q - Quit
mixtrain dataset query training-data # Browse all
mixtrain dataset query training-data "SELECT * WHERE score > 0.8"
mixtrain dataset query training-data "SELECT id, text LIMIT 50"

View Metadata
mixtrain dataset metadata <name>
Display dataset schema and metadata.
mixtrain dataset metadata training-data
Output:
Dataset: training-data
Schema:
├── id: long
├── text: string
└── score: double

Delete Dataset
mixtrain dataset delete <name>
Delete a dataset.
Options:
| Option | Description |
|---|---|
| --yes, -y | Skip confirmation |
mixtrain dataset delete old-dataset
mixtrain dataset delete old-dataset --yes
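In non-interactive contexts such as cleanup scripts, `--yes` avoids blocking on the confirmation prompt. The sketch below loops over several dataset names. The helper name `delete_datasets` is hypothetical, and it assumes the command exits with a non-zero status on failure, which the text above does not confirm.

```shell
# Hypothetical helper (not part of mixtrain): delete each named dataset
# without prompting. Assumes `mixtrain dataset delete` returns a non-zero
# exit status on failure; the loop stops at the first failed deletion.
delete_datasets() {
  for name in "$@"; do
    mixtrain dataset delete "$name" --yes || return 1
  done
}
```

For example, `delete_datasets old-dataset` is equivalent to the second command above. Because `--yes` skips the confirmation, it is worth double-checking the names passed in.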