Mind Lab Toolkit (MinT)

Completers

Completers provide two levels of abstraction over the sampling client. TokenCompleter operates on token IDs and raw ModelInput objects — used by RL loops that work at the token level. MessageCompleter operates on message dicts with role and content — used by evaluators, LLM-as-judge patterns, and chat applications.

Concept

The sampling client returns raw token IDs. For most training and evaluation workflows, you need higher-level abstractions:

  • TokenCompleter — Input: ModelInput (tokens). Output: TokensWithLogprobs (tokens + log-probabilities). Used in RL rollouts, online reward collection, and token-level analysis.
  • MessageCompleter — Input: list[Message] (role + content). Output: Message (structured response). Used in evaluators, LLM judges, multi-turn interactions, and production inference.

Both completers handle:

  • Stop tokens — Generation stops automatically when a stop token or sequence is produced.
  • Sampling parameters — Temperature, top-p, max-tokens, and other standard LLM sampling controls.
  • Logprobs — For RL and importance weighting, you can retrieve log-probabilities alongside tokens.

Pattern

import mint
from mint.completers import TinkerTokenCompleter, TinkerMessageCompleter
from mint.renderers import get_renderer

service_client = mint.ServiceClient()
sampling_client = service_client.create_sampling_client(base_model="Qwen/Qwen3-0.6B")
tokenizer = sampling_client.get_tokenizer()
renderer = get_renderer("qwen3", tokenizer)

# Example 1: TokenCompleter for RL rollouts
token_completer = TinkerTokenCompleter(sampling_client=sampling_client)

prompt_ids = tokenizer.encode("The capital of France is")
prompt = mint.types.ModelInput.from_ints(prompt_ids)
sampling_params = mint.types.SamplingParams(
    max_tokens=16,
    temperature=0.7,
    stop=renderer.get_stop_sequences(),
)

token_result = token_completer.complete(
    prompt=prompt,
    sampling_params=sampling_params,
)
print(f"Tokens: {token_result.tokens}")
print(f"Logprobs: {token_result.logprobs}")

# Example 2: MessageCompleter for evaluation
message_completer = TinkerMessageCompleter(
    sampling_client=sampling_client,
    renderer=renderer,
)

messages = [
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "What is 7 * 8?"},
]

message_result = message_completer.complete(
    messages=messages,
    sampling_params=mint.types.SamplingParams(max_tokens=32, temperature=0.0),
)
print(f"Response: {message_result}")  # {"role": "assistant", "content": "..."}

View full source: https://github.com/MindLab-Research/mint-quickstart/blob/main/concepts/completers.py
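The Concept section above also mentions multi-turn interactions. Because MessageCompleter returns a plain Message dict, the reply can be appended to the history and the completer called again. A minimal sketch, continuing the example above and assuming the same complete() signature:

# Continue the conversation: append the assistant reply, then ask a follow-up.
messages.append(message_result)
messages.append({"role": "user", "content": "And what is 7 * 9?"})

follow_up = message_completer.complete(
    messages=messages,
    sampling_params=mint.types.SamplingParams(max_tokens=32, temperature=0.0),
)
print(f"Follow-up: {follow_up}")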

API Surface

| Class | Input | Output | Use case |
| --- | --- | --- | --- |
| TinkerTokenCompleter | ModelInput (tokens) | TokensWithLogprobs | RL loops, token-level analysis |
| TinkerMessageCompleter | list[Message] (role/content) | Message (role/content) | Evaluation, LLM-as-judge, chat |

Common parameters:

  • sampling_client — The underlying SamplingClient instance.
  • renderer (MessageCompleter only) — Handles message → tokens and tokens → message conversion.
  • sampling_params — SamplingParams(max_tokens, temperature, top_p, stop, ...).

Return types:

  • TokensWithLogprobs — Namedtuple: (tokens: list[int], logprobs: list[float]).
  • Message — Dict: {"role": str, "content": str}.
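As a concrete illustration of the return types, here is a hedged sketch that computes a sequence log-probability from a TokensWithLogprobs result, assuming only the two fields listed above. Summed per-token logprobs are what the importance-weighting use case mentioned earlier relies on:

# token_result comes from token_completer.complete(...) in the Pattern above.
tokens, logprobs = token_result.tokens, token_result.logprobs
assert len(tokens) == len(logprobs)

# Total log-probability of the sampled continuation: log pi(y|x) as the
# sum of per-token logprobs, used for importance weighting in RL.
sequence_logprob = sum(logprobs)
print(f"Sampled {len(tokens)} tokens, total logprob {sequence_logprob:.3f}")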

Caveats & Pitfalls

  • Stop sequences: Always pass stop=renderer.get_stop_sequences() to prevent over-generation. Missing stop tokens can cause the model to continue past the intended message boundary.
  • MessageCompleter needs a renderer: A MessageCompleter without a renderer will fail on the first .complete() call. Always initialize with the correct renderer for your model family.
  • Logprobs cost: Requesting logprobs adds a small computational overhead. In large-scale RL, consider batching completions or filtering logprob retrieval to important trajectories.
  • Async variants: Use complete_async() for concurrent completions. Launch all futures before calling .result() on any of them to maximize throughput; see the sketch after this list.
  • Sampler desync: After saving and reloading model weights, create a new SamplingClient and a new Completer. A stale completer silently samples from old weights.
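A minimal sketch of the fan-out pattern from the async bullet, assuming complete_async() mirrors complete() and returns a future with .result() (as the bullet implies):

# Build a batch of prompts using the tokenizer from the Pattern above.
prompts = [
    mint.types.ModelInput.from_ints(tokenizer.encode(text))
    for text in ["2 + 2 =", "3 + 5 =", "10 - 4 ="]
]

# Launch all requests first so they run concurrently on the server...
futures = [
    token_completer.complete_async(prompt=p, sampling_params=sampling_params)
    for p in prompts
]

# ...then block on results. Calling .result() inside the launch loop
# would serialize the requests and destroy throughput.
results = [f.result() for f in futures]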
