dapo-aime

A self-contained MinT experiment that trains Qwen/Qwen3-4B-Instruct-2507 with direct GRPO (no SFT warm-start) on a local materialization of BytedTsinghua-SIA/DAPO-Math-17k and reports results on AIME 2024 as the fixed benchmark. AIME 2025 and AIME 2026 manifests ship under the same row contract for auxiliary eval.
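
In direct GRPO, each training prompt is answered by a group of sampled completions, and each completion's advantage is its reward normalized by the group's own mean and standard deviation. A minimal NumPy sketch of that group-relative advantage, as an illustration of the general algorithm rather than MinT's actual implementation:

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Group-relative advantages for GRPO.

    rewards: shape (num_prompts, group_size), one scalar reward per
    sampled completion. Each group is normalized by its own statistics,
    so no learned value baseline is needed.
    """
    mean = rewards.mean(axis=1, keepdims=True)
    std = rewards.std(axis=1, keepdims=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 completions each, binary correctness rewards.
rewards = np.array([[1.0, 0.0, 0.0, 1.0],
                    [0.0, 0.0, 0.0, 1.0]])
print(grpo_advantages(rewards))
```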

At a glance

Algorithm: direct GRPO (no SFT warm-start)
Base model: Qwen/Qwen3-4B-Instruct-2507
Training data: local materialization of BytedTsinghua-SIA/DAPO-Math-17k (data/train/full.jsonl)
Benchmark: AIME 2024 (data/eval/aime2024.jsonl); auxiliary: AIME 2025 / 2026
Primary metrics: eval_accuracy, eval_greedy_accuracy, eval_pass_at_k (see the pass@k sketch below)
Upstream README: Open in mint-cookbook →
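
eval_pass_at_k is typically computed with the unbiased estimator from the Codex paper (Chen et al., 2021): sample n completions per problem, count the c correct ones, and estimate the probability that a random size-k subset contains at least one success. A minimal sketch, assuming MinT follows this common convention (the upstream README is authoritative):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k),
    computed as a numerically stable running product."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 2 of 8 samples correct -> pass@4 ≈ 0.786
print(pass_at_k(n=8, c=2, k=4))
```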

For setup, runnable commands, and full eval protocol, see the upstream README. The experiment follows the shared cookbook lifecycle: uv sync → --dry-run → --eval-only → train.
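
Each manifest is a JSONL file, one row per problem, under a shared row contract defined upstream. A minimal loader sketch; it reads raw rows and makes no assumption about field names, which the upstream README defines:

```python
import json
from pathlib import Path

def load_manifest(path: str) -> list[dict]:
    """Read a JSONL manifest into a list of row dicts."""
    rows = []
    with Path(path).open() as f:
        for line in f:
            if line.strip():  # skip blank lines defensively
                rows.append(json.loads(line))
    return rows

# rows = load_manifest("data/eval/aime2024.jsonl")
# print(rows[0].keys())  # inspect the actual row contract
```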
