Mixtrain is a post-training platform for specialized multimodal models. Curate and manage datasets, fine-tune open-source and proprietary models, and run evaluations across image, video, audio, 3D, and text.
Post-training — fine-tuning, RLHF, distillation, and evaluation on top of foundation models — is how you turn a general model into one that works for your domain.
The Model Lifecycle
- Curate - Build datasets from images, video, audio, 3D, and text. Version, visualize, and query them with SQL.
- Train - Fine-tune models on your data. Run on GPUs with checkpointing and distributed training.
- Evaluate - Compare outputs side-by-side across models. Catch regressions before shipping.
- Deploy - Run models in production with workflows and routing across providers.
- Iterate - Feed evaluation results back into your datasets and retrain.
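Two of the steps above, querying a dataset with SQL and catching regressions before shipping, can be sketched in plain Python. This is only an illustration and does not use the Mixtrain SDK: the `samples` table, its columns, the helpers `select_training_rows` and `find_regressions`, and the metric names and values are all hypothetical, and the standard-library `sqlite3` module stands in for whatever SQL engine the platform provides.

```python
import sqlite3


def select_training_rows(conn, modality, min_quality):
    """Curate: select sample ids with a SQL filter (hypothetical schema)."""
    cur = conn.execute(
        "SELECT id FROM samples WHERE modality = ? AND quality >= ? ORDER BY id",
        (modality, min_quality),
    )
    return [row[0] for row in cur.fetchall()]


def find_regressions(baseline, candidate, tolerance=0.01):
    """Evaluate: list metrics where the candidate trails the baseline
    by more than `tolerance` (higher is assumed to be better)."""
    return sorted(
        name
        for name, base in baseline.items()
        if candidate.get(name, float("-inf")) < base - tolerance
    )


# Toy in-memory dataset index; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (id INTEGER PRIMARY KEY, modality TEXT, quality REAL)"
)
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [(1, "image", 0.9), (2, "audio", 0.8), (3, "image", 0.4), (4, "image", 0.7)],
)

selected = select_training_rows(conn, "image", 0.5)

# Hypothetical evaluation scores for a shipped model and a new fine-tune.
baseline = {"accuracy": 0.82, "f1": 0.78}
candidate = {"accuracy": 0.85, "f1": 0.74}
regressed = find_regressions(baseline, candidate)

print(selected)   # [1, 4]
print(regressed)  # ['f1']
```

A non-empty `regressed` list is the signal to hold the deploy and feed the failing cases back into the dataset, which is the Iterate step.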
Open Source Models
Mixtrain supports fine-tuning open-source models across specialized domains:
- Robotics - GR00T N1.6, pi0.5, SmolVLA
- Vision - SmolVLM 2, SigLIP 2, SAM 2.1
- Audio & Speech - Parakeet TDT v3, Canary Qwen, Whisper
Next Steps
- Quickstart - Install the SDK and get started
- Authentication - Configure API keys and credentials
- Datasets - Manage your training and eval data
- Models - Train, fine-tune, and run models
- Examples - See complete working examples