Human Quickstart
You will go from zero to a trained model on a remote MinT server in seven steps. All training runs against mint.macaron.xin (or mint-cn.macaron.xin from mainland China). You do not need a local GPU.
1. Install MinT
Install the MinT client SDK from the public toolkit repo. Python 3.11+ is required.
```bash
pip install git+https://github.com/MindLab-Research/mindlab-toolkit.git
```
This installs `mint` and pulls in a compatible `tinker>=0.15.0` plus runtime helpers. After installation, `import mint` patches the Tinker key validator so MinT keys (`sk-*`) work without further configuration.
If you already have code written against the Tinker client surface, the lowest-friction migration is to alias the import:
```python
import mint as tinker
```
Keep the rest of your Tinker code unchanged and set these environment variables:
```bash
export TINKER_BASE_URL=https://mint.macaron.xin/   # or mint-cn for mainland China
export TINKER_API_KEY=$MINT_API_KEY
```
Why this works: the raw upstream `import tinker` still validates keys against the `tml-` prefix, but MinT keys start with `sk-`. `import mint as tinker` applies MinT's compatibility patches while preserving the familiar Tinker call shapes. Do not call `zero_grad_async()` in MinT training loops; gradient zeroing is handled automatically by the server.
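The key-prefix mismatch described above can be made concrete with a toy check (purely illustrative; this is not the SDK's actual validator):

```python
def looks_like_mint_key(key: str) -> bool:
    """True for MinT-style keys (sk-*), False for upstream Tinker keys (tml-*)."""
    return key.startswith("sk-")

# Upstream Tinker expects tml-* keys, so a raw `import tinker` rejects
# sk-* keys; the mint compatibility patch is what bridges this gap.
```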
2. Set your API key
Sign up for a key at macaron.im/mindlab/mint — registration is self-service. Then set it as an environment variable:
```bash
export MINT_API_KEY=sk-your-api-key-here
export MINT_BASE_URL=https://mint.macaron.xin/   # mainland China: mint-cn.macaron.xin
```
A `.env` file in your project root works too, as long as your script loads it.
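If you go the `.env` route, remember that nothing loads the file automatically. The `python-dotenv` package is the usual choice; a minimal hand-rolled loader (illustrative only, not part of the MinT SDK) is just a few lines:

```python
import os

def load_dotenv(path: str = ".env") -> None:
    """Populate os.environ from simple KEY=VALUE lines; existing vars win."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

After calling it, `os.environ["MINT_API_KEY"]` is visible to the SDK exactly as if you had exported it in your shell.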
3. Pass in your data
MinT trains on a stream of mint.types.Datum objects. Each Datum is a token sequence plus per-token loss weights. The standard helper for chat-style data:
```python
import mint
from mint import types

def process_sft_example(example: dict, tokenizer) -> types.Datum:
    # example = {"prompt": "...", "response": "..."}
    prompt_ids = tokenizer.encode(example["prompt"])
    response_ids = tokenizer.encode(example["response"])
    return types.Datum(
        model_input_ids=prompt_ids + response_ids,
        loss_weights=[0.0] * len(prompt_ids) + [1.0] * len(response_ids),
    )
```
For RL, you sample first, then construct `Datum` objects with reward-derived advantages; see RL Overview.
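The training loop in step 6 iterates with a `batches_of` helper that the snippet assumes you have. A minimal version (illustrative; any chunking utility works, and Python 3.12+ ships `itertools.batched`):

```python
from typing import Iterator, TypeVar

T = TypeVar("T")

def batches_of(items: list[T], batch_size: int) -> Iterator[list[T]]:
    """Yield consecutive fixed-size chunks; the last batch may be short."""
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]
```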
4. Choose a model
Qwen/Qwen3-0.6B is the lightweight default — fast iteration, ideal for first-run sanity checks. For real workloads use one of:
| Model | Use case |
|---|---|
| Qwen/Qwen3-0.6B | Fast iteration, smoke tests |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Mid-scale chat / instruction following |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Large-scale instruction tuning |
| Qwen/Qwen3-235B-A22B-Thinking-2507 | Reasoning / chain-of-thought |
See Supported Models for the full table.
5. Choose an algorithm
| If you have... | Use | MinT path |
|---|---|---|
| Labeled prompt → response targets | SFT | loss_fn="cross_entropy" |
| Pairs of preferred vs rejected responses | DPO | preference loss via forward_backward_custom |
| A reward, verifier, or environment | RL (GRPO) | loss_fn="importance_sampling" |
If you have both supervised data and a reward, you can run SFT then RL — that is what the canonical quickstart.py script does.
6. Start training
Minimal SFT example:
```python
import mint
from mint import types

service_client = mint.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-0.6B",
    rank=16,
    train_mlp=True,
    train_attn=True,
    train_unembed=True,
)

# data: list[types.Datum] from step 3
adam_params = types.AdamParams(learning_rate=5e-5)
for step, batch in enumerate(batches_of(data, batch_size=8)):
    fb_future = training_client.forward_backward(batch, loss_fn="cross_entropy")
    optim_future = training_client.optim_step(adam_params)
    fb_result = fb_future.result()
    optim_future.result()
    print(f"step={step} loss={fb_result.loss}")
```
Training runs on the remote MinT server. Your script blocks on the `.result()` calls; everything heavy happens elsewhere.
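Per-step losses are noisy, so the printed numbers can jump around. If you want a smoother signal in the log line, an exponential moving average is a common monitoring trick (generic Python, not part of the MinT API):

```python
class EmaLoss:
    """Exponential moving average of a scalar loss for nicer logging."""

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay
        self.value: float | None = None

    def update(self, loss: float) -> float:
        if self.value is None:
            self.value = loss  # seed with the first observation
        else:
            self.value = self.decay * self.value + (1 - self.decay) * loss
        return self.value
```

In the loop above you would create `ema = EmaLoss()` once before the loop and log `ema.update(fb_result.loss)` alongside the raw loss.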
7. Start sampling
After training, sample from your trained LoRA:
```python
sampling_client = training_client.save_weights_and_get_sampling_client(name="my-run-v1")

prompt_ids = tokenizer.encode("3 * 7 =")
samples = sampling_client.sample(
    prompt=types.ModelInput.from_ints(prompt_ids),
    sampling_params=types.SamplingParams(max_tokens=16, temperature=0.7),
    num_samples=4,
)
for s in samples.sequences:
    print(tokenizer.decode(s.tokens))
```
For sampling logs and richer evaluation patterns, see Concepts → Evaluations.
What's next?
- SFT Overview — datasets, renderers, completers, distillation
- DPO Overview — preference pairs, β tuning
- RL Overview — GRPO, custom rewards, environments
- Customize Overview — full parameter cross-reference
Troubleshooting. If `_require_api_key()` raises or the preflight check times out, verify that `MINT_API_KEY` and `MINT_BASE_URL` are set and that you can reach mint.macaron.xin (or mint-cn.macaron.xin from mainland China). For other issues, see the FAQ.