Limits and Quotas
Session Timeout
Sampling sessions have a 30-minute inactivity timeout. If no sampling requests are made for 30 minutes, the session expires.
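One way to anticipate the timeout is to track, on the client side, how long it has been since the last sampling request. The sketch below is local bookkeeping only: `IdleTracker` and its parameters are hypothetical names, not part of the mint API, and the server's own view of inactivity remains authoritative. The 5-minute margin is an assumed safety buffer.

```python
import time


class IdleTracker:
    """Local estimate of sampling-session idle time (hypothetical helper)."""

    def __init__(self, timeout_s=30 * 60, margin_s=5 * 60, clock=time.monotonic):
        self.timeout_s = timeout_s      # inactivity timeout stated above
        self.margin_s = margin_s        # assumed safety buffer
        self._clock = clock
        self._last_request = clock()

    def touch(self):
        """Record that a sampling request was just made."""
        self._last_request = self._clock()

    def idle_seconds(self):
        """Seconds elapsed since the last recorded request."""
        return self._clock() - self._last_request

    def needs_refresh(self):
        """True once idle time is within margin_s of the timeout."""
        return self.idle_seconds() >= self.timeout_s - self.margin_s
```

Call `touch()` after each sampling request, and check `needs_refresh()` before the next one to decide whether to obtain a new client first.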
Create a new sampling client to continue:
sampling_client = training_client.save_weights_and_get_sampling_client()

Rate Limiting
When a request is rate limited, the API returns HTTP 429 and the client raises RateLimitError. Implement exponential backoff:
import asyncio
from mint import RateLimitError

async def call_with_backoff(fn, max_retries=5):
    # fn is a zero-argument callable that returns an awaitable
    for attempt in range(max_retries):
        try:
            return await fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: re-raise the last error
            wait_time = 2 ** attempt  # 1, 2, 4, 8, ... seconds
            await asyncio.sleep(wait_time)

Best Practices for Long Training Runs
For RL training or other long-running workloads:
- Create a fresh sampling client each batch: avoids session timeout issues
- Complete sampling within 25 minutes: leaves a buffer before the 30-minute timeout
- Implement retry logic: handle transient RequestFailedError with exponential backoff
- Monitor request_id: save it for debugging if errors persist
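# Run inside an async function. sample_with_retry is assumed to be a
# user-defined retry wrapper (not shown).
for batch in range(num_batches):
    # Fresh client per batch, so the session stays well inside the timeout
    sampling_client = training_client.save_weights_and_get_sampling_client()
    for step in range(max_steps):
        result = await sample_with_retry(sampling_client, prompt, params)
    await training_client.forward_backward_async(...)
    await training_client.optim_step_async(...)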
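The retry advice above can be sketched as a generic async helper. This is a minimal illustration, not the mint implementation: `retry_async`, `make_call`, `retry_on`, `base_delay`, `max_delay`, and `sleep` are all hypothetical names. The exception types are passed in by the caller, so the snippet does not assume where RequestFailedError is imported from. Random jitter is an addition here, not something the text above prescribes; it spreads out retries from concurrent workers.

```python
import asyncio
import random


async def retry_async(make_call, *, retry_on, max_retries=5,
                      base_delay=1.0, max_delay=60.0, sleep=asyncio.sleep):
    """Await make_call(), retrying on the given exception types.

    Hypothetical helper. make_call is a zero-argument callable that
    returns a fresh awaitable on every attempt.
    """
    for attempt in range(max_retries):
        try:
            return await make_call()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, with full jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.uniform(0, delay))
```

A `sample_with_retry` wrapper like the one in the loop above could delegate to this helper, passing a lambda that issues the sampling request and setting `retry_on` to the transient error types to handle.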