Customize

This section is the reference for everything you can configure in a MinT training job. It is organized by algorithm (SFT, DPO, RL, VLA), followed by cross-cutting concepts and end-to-end recipes. Every leaf page follows one of three fixed templates, so the right-side TOC reads the same way across the section.

Where to start

SFT Overview

Supervised fine-tuning with the cross_entropy loss.

DPO Overview

Preference optimization via forward_backward_custom.

RL Overview

GRPO with verifiable / preference / environment rewards.

VLA

OpenPI vision-language-action via SDK or HTTP.

All parameters by algorithm

The full parameter table for each algorithm lives on its own page, under the `## All Parameters` heading. Quick links:

Algorithm	Parameters page
SFT	`sft/index#all-parameters`
SFT hyperparameter recipe	`sft/hyperparameters`
DPO	`dpo/index#all-parameters`
RL — Math	`rl/math-rl#all-parameters`
RL — Chat	`rl/chat-rl#all-parameters`
RL — Code	`rl/code-rl#all-parameters`
RL hyperparameter recipe	`rl/hyperparameters`
VLA	`vla#all-parameters`

Concepts index

Foundational topics that show up across multiple algorithms:

Topic	Page
Rendering / tokenization	Rendering
Loss functions catalog	Loss Functions
Completers (TokenCompleter, MessageCompleter, LLM-as-judge)	Completers
Weights / checkpoints / TTL	Weights
Evaluations (custom, NLL, Inspect AI)	Evaluations
Async patterns / `num_samples`	Async Patterns

Recipes index

End-to-end recipes that combine multiple primitives, plus deployment paths:

Recipe	Page
RLHF 3-stage pipeline	RLHF Pipeline
Multi-turn RL	Multi-turn
Multi-agent RL	Multi-agent
Prompt distillation	Distillation
Custom environment	Custom Environment
Export to HF	Export to HF
LoRA Adapter	LoRA Adapter
Publish to Hub	Publish to Hub

Tinker compatibility. MinT's client SDK is Tinker-API compatible (pip install from mindlab-toolkit). Code patterns that work for Tinker also work for MinT: substitute the endpoint (`mint.macaron.xin` or `mint-cn.macaron.xin`) and the `MINT_API_KEY`. See Human Quickstart → Migrating from Tinker for the full one-line migration.
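
The endpoint and key substitution described above can be sketched as a small helper. This is an illustration only: `mint_connection_settings` and the `region` labels are hypothetical names, not part of the MinT or Tinker SDK, and the `https://` scheme and reading `MINT_API_KEY` from the environment are assumptions. The two hostnames are the ones listed above. How the resulting values are passed to the client is covered in the quickstart and is not shown here.

```python
import os

# Hostnames come from this page; the region labels are hypothetical.
MINT_HOSTS = {
    "global": "mint.macaron.xin",
    "cn": "mint-cn.macaron.xin",
}


def mint_connection_settings(region="global", env=None):
    """Collect the two values that change when moving from Tinker to MinT.

    Hypothetical helper for illustration; not an SDK function.
    """
    env = os.environ if env is None else env
    if region not in MINT_HOSTS:
        raise ValueError(f"unknown region: {region!r}")
    api_key = env.get("MINT_API_KEY")
    if not api_key:
        raise RuntimeError("MINT_API_KEY is not set")
    # Assumption: the endpoint is served over HTTPS.
    return {"base_url": f"https://{MINT_HOSTS[region]}", "api_key": api_key}
```

Failing early when the key is missing or the region is unknown keeps a misconfigured job from starting with Tinker's defaults still in place.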