# lawbench
Self-contained MinT experiment for LawBench. It evaluates the official 20-task LawBench benchmark under one fixed local benchmark contract and maintains a local execution baseline built on Qwen/Qwen3-4B-Instruct-2507 with LoRA SFT. This is not a paper-faithful reproduction of Qzhou-Law or DISC-LawLLM: the official scorer and benchmark contract stay fixed, but the maintained runnable line is a smaller local execution baseline.
## At a glance
|  |  |
| --- | --- |
| Algorithm | LoRA SFT |
| Base model | `Qwen/Qwen3-4B-Instruct-2507` |
| Training data | public DISC-Law-SFT train artifact |
| Benchmark | full 20-task LawBench (`data/eval/full.jsonl`, ~10,000 rows) |
| Primary metric | `eval_lawbench_avg` |
| Upstream README | Open in mint-cookbook → |
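The primary metric aggregates the 20 per-task scores into a single number. A minimal sketch of that aggregation, assuming `eval_lawbench_avg` is the unweighted macro-average of per-task scores (the exact aggregation is defined by the official LawBench scorer, so treat this as illustrative):

```python
def lawbench_avg(task_scores: dict[str, float]) -> float:
    """Unweighted macro-average over the 20 per-task scores (assumed aggregation)."""
    if len(task_scores) != 20:
        raise ValueError(f"expected 20 LawBench tasks, got {len(task_scores)}")
    return sum(task_scores.values()) / len(task_scores)

# Hypothetical per-task scores, for illustration only.
scores = {f"task_{i}": 50.0 for i in range(20)}
print(lawbench_avg(scores))  # 50.0
```

Because the average is unweighted, a regression on any single task moves the headline metric by at most 1/20 of that task's score change.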
For setup, runnable commands, and the full eval protocol, see the upstream README. The experiment follows the shared cookbook lifecycle: `uv sync` → `--dry-run` → `--eval-only` → train.
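The lifecycle above can be sketched as a command sequence. Only `uv sync` is given by the source; the entrypoint name (`main.py` here) and flag spellings are assumptions, so defer to the upstream README for the real invocations:

```shell
# Shared cookbook lifecycle (entrypoint name is hypothetical).
uv sync                    # 1. install the locked environment
python main.py --dry-run   # 2. smoke-test config and data loading, no training
python main.py --eval-only # 3. baseline eval on data/eval/full.jsonl
python main.py             # 4. LoRA SFT, then the full 20-task eval
```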