# Saving and Loading
MinT provides flexible methods for saving and loading model weights and optimizer states.
## Save Methods

### Save Weights for Sampler
Stores weights in a format optimized for inference (faster than a full-state save):

```python
sampling_path = training_client.save_weights_for_sampler(name="0000").result().path
```

### Save Full State
Preserves both weights and optimizer state for resuming training:

```python
resume_path = training_client.save_state(name="0010").result().path
```

## Creating a Sampling Client
```python
import mint

service_client = mint.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-4B-Instruct-2507", rank=32
)

# Save sampler-ready weights, then point a sampling client at the resulting path
sampling_path = training_client.save_weights_for_sampler(name="0000").result().path
sampling_client = service_client.create_sampling_client(model_path=sampling_path)
```

## Resuming Training
```python
resume_path = training_client.save_state(name="0010").result().path

# Full-state resume (weights + optimizer state)
training_client.load_state_with_optimizer(resume_path)

# Weights-only restore (optimizer state is not restored)
training_client.load_state(resume_path)
```

## Use Cases
- Multi-stage workflows: save intermediate checkpoints between stages
- Hyperparameter adjustments: resume from a checkpoint with new settings
- Failure recovery: restore training after interruptions
- Optimizer state preservation: maintain training momentum across sessions
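The checkpoint names used in the examples above ("0000", "0010") look like zero-padded step counts. A minimal sketch of that naming convention, with a periodic-save helper; `checkpoint_name`, `should_save`, and the 10-step interval are illustrative assumptions, not part of MinT:

```python
def checkpoint_name(step: int, width: int = 4) -> str:
    """Format a training step as a zero-padded checkpoint name, e.g. 10 -> "0010"."""
    return f"{step:0{width}d}"

def should_save(step: int, interval: int = 10) -> bool:
    """Save a full-state checkpoint every `interval` steps (step 0 included)."""
    return step % interval == 0

# Which steps of a 30-step run would be checkpointed, and under what names
names = [checkpoint_name(s) for s in range(31) if should_save(s)]
print(names)  # ['0000', '0010', '0020', '0030']
```

In a training loop, each selected step would call `training_client.save_state(name=checkpoint_name(step))` as shown above.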
## Uploading Custom Checkpoints
You can upload your own LoRA checkpoints to continue training:
```python
import mint

service_client = mint.ServiceClient()

# Upload a tar.gz checkpoint archive
with open("my_checkpoint.tar.gz", "rb") as f:
    result = service_client.upload_checkpoint(f).result()
checkpoint_id = result.checkpoint_id

# Create a training client from the uploaded checkpoint
training_client = service_client.create_training_client_from_state(
    checkpoint_path=checkpoint_id
)
```

Checkpoint format requirements:
- Must be a `.tar.gz` archive
- Should contain LoRA adapter files (e.g., `adapter_model.safetensors`, `adapter_config.json`)
- Metadata files are optional but recommended
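Packaging an adapter directory into an archive that meets these requirements can be done with Python's standard library. A sketch, where the directory contents are placeholders and only the file names follow the examples above:

```python
import tarfile
import tempfile
from pathlib import Path

# Illustrative adapter directory containing the files named above
adapter_dir = Path(tempfile.mkdtemp())
(adapter_dir / "adapter_model.safetensors").write_bytes(b"\x00" * 16)  # placeholder weights
(adapter_dir / "adapter_config.json").write_text('{"r": 32}')  # placeholder config

# Write both files at the root of a .tar.gz archive
archive_path = adapter_dir / "my_checkpoint.tar.gz"
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(adapter_dir / "adapter_model.safetensors", arcname="adapter_model.safetensors")
    tar.add(adapter_dir / "adapter_config.json", arcname="adapter_config.json")

# Verify the archive lists the expected members
with tarfile.open(archive_path, "r:gz") as tar:
    print(sorted(tar.getnames()))  # ['adapter_config.json', 'adapter_model.safetensors']
```

The resulting file can then be opened in binary mode and passed to `upload_checkpoint` as shown above.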
## Downloading Checkpoints
```python
# List checkpoints for a training run
checkpoints = service_client.list_checkpoints(model_id=training_client.model_id).result()

# Download one checkpoint as a tar.gz archive (checkpoint_id taken from the listing above)
archive_bytes = service_client.download_checkpoint(checkpoint_id).result()
with open("downloaded_checkpoint.tar.gz", "wb") as f:
    f.write(archive_bytes)
```
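Once downloaded, the archive bytes can be inspected without touching disk using the standard library. A sketch with an in-memory archive standing in for the `archive_bytes` returned above:

```python
import io
import tarfile

# Build illustrative archive bytes in memory (stand-in for a downloaded checkpoint)
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    data = b'{"r": 32}'
    info = tarfile.TarInfo(name="adapter_config.json")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
archive_bytes = buf.getvalue()

# List members and read one file directly from the byte stream
with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
    members = tar.getnames()
    config = tar.extractfile("adapter_config.json").read()
print(members)  # ['adapter_config.json']
print(config)   # b'{"r": 32}'
```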