Rendering
Rendering converts a list of messages into a token sequence that a model can consume. While similar to HuggingFace chat templates, MinT's rendering system handles the full training lifecycle: supervised learning, reinforcement learning, and deployment. The renderer sits between your high-level conversation data and the low-level tokens the model sees.
Concept
A renderer maps between two worlds: structured message dicts (with roles like "user", "assistant", "system") and flat token sequences. During SFT, you use build_supervised_example() to separate prompt tokens (masked from loss) and completion tokens (where the model learns). During RL and sampling, you use build_generation_prompt() to construct a prompt, then parse_response() to decode the model's sampled tokens back into a message.
Renderers are model-family-specific. Qwen3, Llama 3, DeepSeek V3, and other models each have their own special tokens and format conventions. MinT ships renderers for the major families; each one applies the correct chat format and special tokens for its family:
Messages (list of dicts) --> Renderer --> Token IDs (list of ints)
+ loss masks (for SFT)
+ stop tokens (for sampling)
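For intuition, here is roughly what the Qwen3 renderer emits as text before tokenization (ChatML-style markers; the exact tokens and whitespace are determined by the renderer, so treat this as a sketch, not a spec):

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is the longest-lived rodent species?<|im_end|>
<|im_start|>assistant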
Pattern
Here is the canonical pattern for rendering a conversation:
import mint
from mint import types
# Initialize renderer and tokenizer
service_client = mint.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-0.6B",
    rank=16,
)
tokenizer = training_client.get_tokenizer()
renderer = mint.renderers.get_renderer("qwen3", tokenizer)
# Build a conversation
messages = [
    {"role": "system", "content": "Answer concisely; at most one sentence per response"},
    {"role": "user", "content": "What is the longest-lived rodent species?"},
    {"role": "assistant", "content": "The naked mole rat, which can live over 30 years."},
    {"role": "user", "content": "How do they live so long?"},
    {
        "role": "assistant",
        "content": "They evolved protective mechanisms including special hyaluronic acid that prevents cancer, efficient DNA repair systems, and extremely stable proteins.",
    },
]
# For SFT: build tokens + loss weights
model_input, weights = renderer.build_supervised_example(messages)
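# `weights` aligns with the tokens in `model_input`: prompt tokens are
# masked out of the loss and completion tokens are weighted in, so the
# model learns only from the assistant turn(s).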
# For sampling: build a prompt for the model to complete
prompt = renderer.build_generation_prompt(messages[:-1]) # Exclude last assistant message
stop_sequences = renderer.get_stop_sequences() # When to stop sampling
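# `prompt` ends exactly where the assistant's turn should begin, and
# `stop_sequences` tells the sampler where that turn ends.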
# Parse sampled tokens back into a message
sampled_tokens = [45, 7741, 34651, 31410] # (in practice, from model.sample)
message, success = renderer.parse_response(sampled_tokens)

View full source: https://github.com/MindLab-Research/mint-quickstart/blob/main/concepts/rendering.py
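To connect these pieces during sampling, here is a minimal sketch. The sampling client and the shape of its sample() call are assumptions, not part of this section; only SamplingParams(stop=...) is referenced in the caveats below.

# Hypothetical sampling round trip (sampling_client and its API are assumed):
params = types.SamplingParams(stop=stop_sequences)                # halt at end-of-turn
sampled = sampling_client.sample(prompt, sampling_params=params)  # assumed call shape
message, success = renderer.parse_response(sampled.tokens)        # assumed result shape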
API Surface
| Method | Purpose | Returns |
|---|---|---|
| build_supervised_example(messages, train_on_what) | Convert messages to tokens + loss weights for SFT | (ModelInput, weights_array) |
| build_generation_prompt(messages) | Convert messages to a prompt for model completion | ModelInput |
| get_stop_sequences() | Tokens that signal end-of-generation | list[int \| str] |
| parse_response(tokens) | Convert sampled tokens back to a message | (message dict, success bool) |
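The boolean from parse_response() signals whether the sampled tokens formed a well-formed message. A brief sketch of one way a caller might handle it (the skip-on-failure policy is illustrative, not prescribed by MinT):

message, success = renderer.parse_response(sampled_tokens)
if success:
    messages.append(message)  # well-formed assistant turn; keep it
else:
    pass  # truncated or malformed generation; discard or resample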
Renderers (use get_renderer(name, tokenizer))
"qwen3"— Qwen3 with thinking enabled (default)"qwen3_disable_thinking"— Qwen3 without thinking"llama3"— Llama 3"deepseekv3"— DeepSeek V3 (non-thinking)"deepseekv3_thinking"— DeepSeek V3 (thinking mode)"nemotron3"— NVIDIA Nemotron 3"kimi_k2"— Kimi K2
Vision renderers (for VLMs like Qwen3-VL):
from mint.image_processing_utils import get_image_processor
image_processor = get_image_processor("Qwen/Qwen3-VL-235B-A22B-Instruct")
renderer = mint.renderers.get_renderer("qwen3_vl_instruct", tokenizer, image_processor=image_processor)

Caveats & Pitfalls
- Renderer mismatch: Always use get_renderer() with the correct model family. A Llama 3 renderer applied to Qwen3 tokens will produce incorrect special tokens.
- Vision encoding: VL models require both a tokenizer and an image processor. Missing the image processor will cause shape mismatches on vision inputs.
- Loss masking: By default, build_supervised_example() trains only on the final assistant message. Use train_on_what=TrainOnWhat.ALL_ASSISTANT_MESSAGES to train on all assistant turns (see the sketch after this list).
- Stop tokens: get_stop_sequences() returns stop sequences specific to the model family. For sampling, always pass these to SamplingParams(stop=...).
- Custom formats: For unlisted model families, use register_renderer() to define your own renderer before training. An unregistered format will silently use the wrong special tokens.
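For instance, to train on every assistant turn rather than only the last (the import location of TrainOnWhat is an assumption; check where your MinT version exposes it):

from mint.renderers import TrainOnWhat  # import path assumed

model_input, weights = renderer.build_supervised_example(
    messages,
    train_on_what=TrainOnWhat.ALL_ASSISTANT_MESSAGES,  # weight all assistant turns
)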