MIS Rollout Correction

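This page corresponds to advanced/validate_mis_rollout_correction.py in mint-quickstart.

What this validation proves

create_model accepts a session-level Seq-MIS rollout_correction_config
A subsequent forward_backward(..., loss_fn="importance_sampling") call succeeds without passing a per-step rollout config again
The response contains valid loss_fn_outputs
The temporarily created model is cleaned up by default

This is a deliberately narrow integration check, not a full RL tutorial.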
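
The response check in the list above can be pictured as a small shape test on the decoded response. This is a minimal sketch, not the script's actual code: it assumes the response has already been decoded into a Python dict and that "valid" means a non-empty list under loss_fn_outputs. The helper name has_valid_loss_fn_outputs is hypothetical.

```python
from typing import Any, Mapping


def has_valid_loss_fn_outputs(response: Mapping[str, Any]) -> bool:
    """Return True if the response carries a non-empty loss_fn_outputs list.

    Assumption: "valid" means the key exists and maps to a non-empty list.
    The real validation script may apply stricter checks.
    """
    outputs = response.get("loss_fn_outputs")
    return isinstance(outputs, list) and len(outputs) > 0
```

A response that fails this kind of check would presumably be what the script reports as a malformed response.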

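Currently recommended path

The currently recommended path is the direct request / Tinker-compatible request path used by the quickstart validation script. This page does not imply that a high-level MinT SDK helper dedicated to MIS rollout-correction wiring already exists.

Command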

export MINT_API_KEY=sk-...
export MINT_BASE_URL=<your-region-endpoint>
python advanced/validate_mis_rollout_correction.py --base-model Qwen/Qwen3-30B-A3B-Instruct-2507

Choose the MinT domain for your region:

Mainland China: https://mint-cn.macaron.xin/
Outside mainland China: https://mint.macaron.xin/

Supported inputs

--base-url
--api-key
--base-model
--lora-rank
--mis-threshold
--create-timeout-s
--forward-backward-timeout-s
--poll-interval-s
--skip-cleanup
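
For orientation, the flags above could be declared with argparse roughly as follows. This is a sketch under assumptions, not the script's real parser: the flag names come from the list above, while the value types, the None defaults, and treating --skip-cleanup as a boolean switch are guesses made for illustration. The function name build_parser is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names are taken from the list above. Types and defaults are
    # assumptions for this sketch; the real script may differ.
    parser = argparse.ArgumentParser(prog="validate_mis_rollout_correction.py")
    parser.add_argument("--base-url", default=None)
    parser.add_argument("--api-key", default=None)
    parser.add_argument("--base-model", default=None)
    parser.add_argument("--lora-rank", type=int, default=None)
    parser.add_argument("--mis-threshold", type=float, default=None)
    parser.add_argument("--create-timeout-s", type=float, default=None)
    parser.add_argument("--forward-backward-timeout-s", type=float, default=None)
    parser.add_argument("--poll-interval-s", type=float, default=None)
    # Assumed to be a switch: cleanup runs unless this flag is given.
    parser.add_argument("--skip-cleanup", action="store_true")
    return parser
```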

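When both the MinT-style environment variables and their Tinker-compatible aliases are set, the MinT-style ones take precedence:

MINT_BASE_URL takes precedence over TINKER_BASE_URL
MINT_API_KEY takes precedence over TINKER_API_KEY
MINT_BASE_MODEL takes precedence over TINKER_MODEL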
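
The precedence rule above can be sketched as a small lookup. This is an illustration under assumptions, not the script's implementation: the helper names resolve_setting and resolve_config are hypothetical, and treating an empty string as unset is a choice made here.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_setting(
    mint_name: str,
    tinker_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the MinT-style value if set, else the Tinker-compatible alias."""
    env = os.environ if env is None else env
    # Assumption: an empty string counts as unset.
    return env.get(mint_name) or env.get(tinker_name) or None


def resolve_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Resolve the three settings using the pairs listed above."""
    return {
        "base_url": resolve_setting("MINT_BASE_URL", "TINKER_BASE_URL", env),
        "api_key": resolve_setting("MINT_API_KEY", "TINKER_API_KEY", env),
        "base_model": resolve_setting("MINT_BASE_MODEL", "TINKER_MODEL", env),
    }
```

Passing an explicit env mapping keeps the function easy to exercise without touching the real process environment.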

Expected output

[config] base_url=<your-region-endpoint> base_model=Qwen/Qwen3-30B-A3B-Instruct-2507 lora_rank=8 mis_threshold=1.1
[create_model] submitted session_id=validate-mis-1234abcd
[create_model] resolved model_id=model_...
[forward_backward] submitted model_id=model_... loss_fn=importance_sampling
[forward_backward] resolved outputs=1
PASS: MIS rollout_correction request succeeded and response was valid
[cleanup] deleted model_id=model_...

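Common failures

FAIL [config]: the API key is missing
FAIL [create_model]: model creation was rejected or timed out, or the model is not available to the current account
FAIL [forward_backward]: the training request failed after the model was created successfully
FAIL [malformed_response]: the server response does not contain loss_fn_outputs
[cleanup] warning: the validation finished, but the best-effort delete failed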
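
The staged failure labels and the best-effort cleanup can be sketched as below. This is a hedged illustration, not the script's code: the StageError class, the run_validation function, the exit codes, and the choice to attempt cleanup after a failed run are all assumptions. Only the stage labels and message prefixes come from this page.

```python
import sys
from typing import Callable


class StageError(Exception):
    """Carries the stage label that appears inside the FAIL line."""

    def __init__(self, stage: str, message: str) -> None:
        super().__init__(message)
        self.stage = stage


def run_validation(
    validate: Callable[[], None],
    cleanup: Callable[[], None],
    skip_cleanup: bool = False,
) -> int:
    """Run validate(), then attempt cleanup; return a process exit code."""
    exit_code = 0
    try:
        validate()
        print("PASS: MIS rollout_correction request succeeded and response was valid")
    except StageError as exc:
        print(f"FAIL [{exc.stage}]: {exc}")
        exit_code = 1
    finally:
        if not skip_cleanup:
            try:
                cleanup()
            except Exception as exc:
                # Best-effort: a failed delete is only a warning and
                # does not change the PASS/FAIL verdict.
                print(f"[cleanup] warning: {exc}", file=sys.stderr)
    return exit_code
```

The point of the structure is that a cleanup failure is reported separately and never turns a passing validation into a failing one.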

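Status labels

When recording the result of this flow, use:

CONFIRMED: end-to-end validation against remote MinT passed
PARTIAL: the docs and script are in place, but a server-side blocker exists
INSUFFICIENT_DATA: the docs and script are in place, but there is not yet enough reliable run evidence

For compatibility status notes, see Tinker Compatibility.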
