# Model Lineup
MinT lists models by availability status.
## Available Models
| Model Name | Training Type | Architecture | Size | Context |
|---|---|---|---|---|
| Qwen/Qwen3-0.6B | Hybrid | Dense | Tiny | 32k |
| Qwen/Qwen3-4B-Instruct-2507 | Instruction | Dense | Compact | 32k |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Instruction | MoE | Medium | 32k |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Instruction | MoE | Large | 32k |
| moonshotai/Kimi-K2-Instruct* | Instruction | MoE | Large | 32k |
| zai-org/GLM-5* | Reasoning | MoE | Large | 32k |
*Contact sales for Kimi-K2 and GLM-5 access.
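MinT's client interface is not shown on this page; as a sketch, assuming an OpenAI-compatible chat completions endpoint (the base URL below is a placeholder, not MinT's real address), a request for one of the available models might be built like this:

```python
import json

# Hypothetical endpoint; substitute your actual MinT base URL and API key.
BASE_URL = "https://api.example.com/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat completion payload for a listed model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 256,
    }

payload = build_chat_request("Qwen/Qwen3-30B-A3B-Instruct-2507", "Hello!")
print(json.dumps(payload, indent=2))
```

The `model` field takes the full model name exactly as listed in the table above.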
## Coming Soon
| Model Name | Training Type | Architecture | Size | Context |
|---|---|---|---|---|
| Qwen/Qwen3-30B-A3B | Hybrid | MoE | Medium | 32k |
| Qwen/Qwen3-30B-A3B-Base | Base | MoE | Medium | 32k |
| Qwen/Qwen3-8B | Hybrid | Dense | Small | 32k |
| Qwen/Qwen3-8B-Base | Base | Dense | Small | 32k |
| deepseek-ai/DeepSeek-V3.1 | Hybrid | MoE | Large | 32k |
| deepseek-ai/DeepSeek-V3.1-Base | Base | MoE | Large | 32k |
| Qwen/Qwen3-VL-30B-A3B-Instruct | Vision | MoE | Medium | 32k |
| Qwen/Qwen3-VL-235B-A22B-Instruct | Vision | MoE | Large | 32k |
| π0 | Robotics | Dense | Small | 32k |
## Model Selection Recommendations
- Low-latency: Qwen3-0.6B or Qwen3-4B-Instruct-2507
- Balanced quality: Qwen3-30B-A3B-Instruct-2507
- Frontier scale: Qwen3-235B-A22B-Instruct-2507
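The recommendations above can be encoded as a small lookup — an illustrative helper, not part of MinT's SDK:

```python
# Illustrative mapping from deployment priority to recommended model
# (taken from the recommendations above; the helper itself is hypothetical).
RECOMMENDED = {
    "low-latency": "Qwen/Qwen3-4B-Instruct-2507",
    "balanced": "Qwen/Qwen3-30B-A3B-Instruct-2507",
    "frontier": "Qwen/Qwen3-235B-A22B-Instruct-2507",
}

def pick_model(priority: str) -> str:
    """Return the recommended model name for a given priority."""
    try:
        return RECOMMENDED[priority]
    except KeyError:
        raise ValueError(
            f"unknown priority {priority!r}; choose from {sorted(RECOMMENDED)}"
        )

print(pick_model("balanced"))  # Qwen/Qwen3-30B-A3B-Instruct-2507
```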
## Model Categories
### By Training Type

- Base - Pretrained only, no instruction tuning
- Hybrid - Mixed general + instruction behavior
- Instruction - Fine-tuned for instruction following
- Reasoning - Tuned for extended, multi-step reasoning
- Vision - Accepts image inputs alongside text
- Robotics - Trained for robot control tasks
### By Architecture

- Dense - Traditional transformer; all parameters active for every token
- MoE (Mixture of Experts) - Sparse activation for efficiency; only a subset of experts runs per token
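The MoE idea can be sketched in a few lines: a gate scores all experts for each token, and only the top-k are activated, so compute scales with the active subset rather than the full parameter count. This is a conceptual illustration, not MinT's or Qwen's actual routing code:

```python
# Conceptual sketch of MoE top-k routing: only k of n experts run per
# token, so per-token compute scales with active, not total, parameters.

def route_top_k(gate_scores: list[float], k: int = 2) -> list[int]:
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i],
                    reverse=True)
    return sorted(ranked[:k])

# 8 experts available, but each token activates only 2 of them.
scores = [0.1, 0.7, 0.05, 0.3, 0.9, 0.2, 0.15, 0.4]
active = route_top_k(scores, k=2)
print(active)  # [1, 4]
```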
### By Size
- Tiny: under 1B parameters
- Compact: 1-4B parameters
- Medium: ~30B parameters
- Large: 200B+ parameters
## Cost Efficiency

MoE models offer superior cost efficiency because pricing scales with active parameters rather than total model size. For example, a 235B-parameter MoE model with 22B active parameters costs the same to run as a 22B dense model.
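The arithmetic behind that claim can be made explicit. The per-token rate below is made up for illustration; only the proportionality to active parameters reflects the pricing model described above:

```python
# Illustrative billing model (the rate is hypothetical): cost scales
# with active parameters per token, not with total parameter count.

RATE_PER_B_ACTIVE = 0.01  # hypothetical $ per 1M tokens per billion active params

def cost_per_million_tokens(active_params_b: float) -> float:
    """Cost per 1M tokens given billions of *active* parameters."""
    return active_params_b * RATE_PER_B_ACTIVE

moe_235b = cost_per_million_tokens(22)   # 235B-total MoE, 22B active
dense_22b = cost_per_million_tokens(22)  # 22B dense: all params active
print(moe_235b == dense_22b)  # True: same active params, same price
```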