Customize

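This section is a reference for everything that can be configured in a MinT training job. It is organized by algorithm (SFT / DPO / RL / VLA), plus the concepts and recipes that cut across several algorithms. Each leaf page declares one of three fixed templates, so the right-hand TOC reads consistently across the whole section.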

Where to start

VLA

OpenPI vision-language-action, via the SDK or HTTP.

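All parameters (by algorithm)

The full parameter table for each algorithm lives in the ## All Parameters section of that algorithm's own page. Direct links: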

Algorithm	Parameter page
SFT	`sft/index#all-parameters`
SFT hyperparameter recipe	`sft/hyperparameters`
DPO	`dpo/index#all-parameters`
RL — Math	`rl/math-rl#all-parameters`
RL — Chat	`rl/chat-rl#all-parameters`
RL — Code	`rl/code-rl#all-parameters`
RL hyperparameter recipe	`rl/hyperparameters`
VLA	`vla#all-parameters`

Concept index

Foundational topics that come up across several algorithms:

Topic	Page
Rendering / tokenization	Rendering
Loss function catalog	Loss Functions
Completers (TokenCompleter, MessageCompleter, LLM-as-judge)	Completers
Weights / checkpoints / TTL	Weights
Evaluations (custom, NLL, Inspect AI)	Evaluations
Async patterns / `num_samples`	Async Patterns

Recipe index

End-to-end recipes that combine several primitives, plus deployment paths:

Recipe	Page
Three-stage RLHF pipeline	RLHF Pipeline
Multi-turn RL	Multi-turn
Multi-agent RL	Multi-agent
Prompt distillation	Distillation
Custom environment	Custom Environment
Export to HF	Export to HF
LoRA Adapter	LoRA Adapter
Publish to Hub	Publish to Hub

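Tinker-compatible. The MinT client SDK is compatible with the Tinker API (the pip install comes from mindlab-toolkit). Code patterns that run on Tinker also run on MinT: switch the endpoint to mint.macaron.xin or mint-cn.macaron.xin, and switch the key to MINT_API_KEY. For the full migration, see Human Quickstart → Migrating from Tinker.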
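As a rough illustration of the endpoint and key swap described above, the sketch below resolves the two connection values without calling any SDK. Only the two hostnames and the MINT_API_KEY variable name come from this page; the `https://` scheme, the `"global"` / `"cn"` region labels, and the helper name `mint_connection_settings` are assumptions made for the example, not part of the MinT or Tinker API.

```python
import os
from typing import Mapping, Optional

# Hostnames are the two named on this page.
# The https:// scheme and the region labels are assumptions for illustration.
MINT_ENDPOINTS = {
    "global": "https://mint.macaron.xin",
    "cn": "https://mint-cn.macaron.xin",
}


def mint_connection_settings(
    region: str = "global",
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: return the endpoint and key to hand to a client.

    `env` defaults to the process environment; pass a dict to test it.
    """
    env = os.environ if env is None else env
    if region not in MINT_ENDPOINTS:
        raise ValueError(f"unknown region {region!r}; expected one of {sorted(MINT_ENDPOINTS)}")
    api_key = env.get("MINT_API_KEY")
    if not api_key:
        raise RuntimeError("MINT_API_KEY is not set")
    return {"base_url": MINT_ENDPOINTS[region], "api_key": api_key}
```

How these two values are passed to the client is covered by the migration guide; the helper only keeps the endpoint choice and the key lookup in one place so that existing Tinker-style code needs a single change point.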
