# Model Lineup
MinT lists models by availability status.
## Available Models
| Model Name | Training Type | Architecture | Size | Context |
|---|---|---|---|---|
| Qwen/Qwen3-0.6B | Hybrid | Dense | Tiny | 32k |
| Qwen/Qwen3-4B-Instruct-2507 | Instruction | Dense | Compact | 32k |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Instruction | MoE | Medium | 32k |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Instruction | MoE | Large | 32k |
| moonshotai/Kimi-K2-Instruct* | Instruction | MoE | Large | 32k |
| zai-org/GLM-5* | Reasoning | MoE | Large | 32k |
*Contact sales for Kimi-K2 and GLM-5 access.
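MinT's client interface is not shown on this page; as a sketch, assuming an OpenAI-compatible chat completions endpoint (the base URL below is a placeholder, not MinT's real address), a request for one of the available models might be built like this:

```python
import json

# Hypothetical endpoint; substitute your actual MinT base URL and API key.
BASE_URL = "https://api.example.com/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat completion payload for a listed model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 256,
    }

payload = build_chat_request("Qwen/Qwen3-30B-A3B-Instruct-2507", "Hello!")
print(json.dumps(payload, indent=2))
```

The `model` field takes the full model name exactly as listed in the table above.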
## Coming Soon
| Model Name | Training Type | Architecture | Size | Context |
|---|---|---|---|---|
| Qwen/Qwen3-30B-A3B | Hybrid | MoE | Medium | 32k |
| Qwen/Qwen3-30B-A3B-Base | Base | MoE | Medium | 32k |
| Qwen/Qwen3-8B | Hybrid | Dense | Small | 32k |
| Qwen/Qwen3-8B-Base | Base | Dense | Small | 32k |
| deepseek-ai/DeepSeek-V3.1 | Hybrid | MoE | Large | 32k |
| deepseek-ai/DeepSeek-V3.1-Base | Base | MoE | Large | 32k |
| Qwen/Qwen3-VL-30B-A3B-Instruct | Vision | MoE | Medium | 32k |
| Qwen/Qwen3-VL-235B-A22B-Instruct | Vision | MoE | Large | 32k |
| π0 | Robotics | Dense | Small | 32k |
## Model Selection Recommendations
- Low-latency: Qwen3-0.6B or Qwen3-4B-Instruct-2507
- Balanced quality: Qwen3-30B-A3B-Instruct-2507
- Frontier scale: Qwen3-235B-A22B-Instruct-2507
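The recommendations above can be encoded as a small lookup — an illustrative helper, not part of MinT's SDK:

```python
# Illustrative mapping from deployment priority to recommended model
# (taken from the recommendations above; the helper itself is hypothetical).
RECOMMENDED = {
    "low-latency": "Qwen/Qwen3-4B-Instruct-2507",
    "balanced": "Qwen/Qwen3-30B-A3B-Instruct-2507",
    "frontier": "Qwen/Qwen3-235B-A22B-Instruct-2507",
}

def pick_model(priority: str) -> str:
    """Return the recommended model name for a given priority."""
    try:
        return RECOMMENDED[priority]
    except KeyError:
        raise ValueError(
            f"unknown priority {priority!r}; choose from {sorted(RECOMMENDED)}"
        )

print(pick_model("balanced"))  # Qwen/Qwen3-30B-A3B-Instruct-2507
```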
## Model Categories
### By Training Type

- Base - Pretrained only, no instruction tuning
- Hybrid - Mixed general + instruction behavior
- Instruction - Fine-tuned for instruction following
- Reasoning - Tuned for extended, multi-step reasoning
- Vision - Accepts image inputs alongside text
- Robotics - Trained for robot control tasks
### By Architecture

- Dense - Traditional transformer; all parameters active for every token
- MoE (Mixture of Experts) - Sparse activation for efficiency; only a subset of experts runs per token
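The MoE idea can be sketched in a few lines: a gate scores all experts for each token, and only the top-k are activated, so compute scales with the active subset rather than the full parameter count. This is a conceptual illustration, not MinT's or Qwen's actual routing code:

```python
# Conceptual sketch of MoE top-k routing: only k of n experts run per
# token, so per-token compute scales with active, not total, parameters.

def route_top_k(gate_scores: list[float], k: int = 2) -> list[int]:
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i],
                    reverse=True)
    return sorted(ranked[:k])

# 8 experts available, but each token activates only 2 of them.
scores = [0.1, 0.7, 0.05, 0.3, 0.9, 0.2, 0.15, 0.4]
active = route_top_k(scores, k=2)
print(active)  # [1, 4]
```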
### By Size
- Tiny: under 1B parameters
- Compact: 1-4B parameters
- Medium: ~30B parameters
- Large: 200B+ parameters
## Cost Efficiency

MoE models offer superior cost efficiency because pricing scales with active parameters rather than total model size. For example, a 235B-parameter MoE model with 22B active parameters costs the same to run as a 22B dense model.
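The arithmetic behind that claim can be made explicit. The per-token rate below is made up for illustration; only the proportionality to active parameters reflects the pricing model described above:

```python
# Illustrative billing model (the rate is hypothetical): cost scales
# with active parameters per token, not with total parameter count.

RATE_PER_B_ACTIVE = 0.01  # hypothetical $ per 1M tokens per billion active params

def cost_per_million_tokens(active_params_b: float) -> float:
    """Cost per 1M tokens given billions of *active* parameters."""
    return active_params_b * RATE_PER_B_ACTIVE

moe_235b = cost_per_million_tokens(22)   # 235B-total MoE, 22B active
dense_22b = cost_per_million_tokens(22)  # 22B dense: all params active
print(moe_235b == dense_22b)  # True: same active params, same price
```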