Customize
This section is the reference for everything you can configure in a MinT training job. It is organized by algorithm (SFT, DPO, RL, VLA) plus cross-cutting concepts and recipes. Every leaf page declares one of three fixed templates so the right-side TOC reads the same way across the section.
Where to start
| Page | What it covers |
|---|---|
| SFT Overview | Supervised fine-tuning with the cross_entropy loss. |
| DPO Overview | Preference optimization via forward_backward_custom. |
| RL Overview | GRPO with verifiable / preference / environment rewards. |
| VLA | OpenPI vision-language-action via SDK or HTTP. |
All parameters by algorithm
The full parameter table for each algorithm lives on its own page, in the ## All Parameters section. Quick links:
| Algorithm | Parameters page |
|---|---|
| SFT | sft/index#all-parameters |
| SFT hyperparameter recipe | sft/hyperparameters |
| DPO | dpo/index#all-parameters |
| RL — Math | rl/math-rl#all-parameters |
| RL — Chat | rl/chat-rl#all-parameters |
| RL — Code | rl/code-rl#all-parameters |
| RL hyperparameter recipe | rl/hyperparameters |
| VLA | vla#all-parameters |
Concepts index
Foundational topics that show up across multiple algorithms:
| Topic | Page |
|---|---|
| Rendering / tokenization | Rendering |
| Loss functions catalog | Loss Functions |
| Completers (TokenCompleter, MessageCompleter, LLM-as-judge) | Completers |
| Weights / checkpoints / TTL | Weights |
| Evaluations (custom, NLL, Inspect AI) | Evaluations |
| Async patterns / num_samples | Async Patterns |
Recipes index
End-to-end recipes that combine multiple primitives, plus deployment paths:
| Recipe | Page |
|---|---|
| RLHF 3-stage pipeline | RLHF Pipeline |
| Multi-turn RL | Multi-turn |
| Multi-agent RL | Multi-agent |
| Prompt distillation | Distillation |
| Custom environment | Custom Environment |
| Export to HF | Export to HF |
| LoRA Adapter | LoRA Adapter |
| Publish to Hub | Publish to Hub |
Tinker compatibility. MinT's client SDK is Tinker-API compatible and is installed with pip from mindlab-toolkit. Code patterns that work for Tinker also work for MinT: substitute the endpoint (mint.macaron.xin or mint-cn.macaron.xin) and set MINT_API_KEY. See Human Quickstart → Migrating from Tinker for the full one-line migration.
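As a rough illustration of the endpoint and key substitution described above, the sketch below collects the two settings in one place. It is not part of the MinT or Tinker SDK: `resolve_mint_settings`, the `region` labels, and the `https://` scheme are assumptions made for this example; only the two hostnames and the `MINT_API_KEY` variable name come from this page. How the values are passed to the client is covered in Migrating from Tinker.

```python
import os

# Hostnames come from this page; the https:// scheme is an assumption.
MINT_ENDPOINTS = {
    "global": "https://mint.macaron.xin",
    "cn": "https://mint-cn.macaron.xin",
}


def resolve_mint_settings(region="global", env=None):
    """Return the endpoint and API key a Tinker-style client would need.

    Hypothetical helper: it only gathers settings and never calls the SDK.
    """
    env = os.environ if env is None else env
    if region not in MINT_ENDPOINTS:
        raise ValueError(f"unknown region: {region!r}")
    api_key = env.get("MINT_API_KEY")
    if not api_key:
        raise RuntimeError("MINT_API_KEY is not set")
    return {"base_url": MINT_ENDPOINTS[region], "api_key": api_key}
```

Failing early when the key is missing keeps a misconfigured job from reaching the training loop before the problem surfaces.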