OpenPI VLA HTTP
This page corresponds to demos/embodied/openpi_vla_http.py in mint-quickstart.
What this example does

- Skips the high-level mintx SDK helpers and talks to the API over raw HTTP, showing the current OpenPI FAST wire format as-is.
- Runs one minimal VLA training loop: create_session -> create_model -> train_step -> save_weights_for_sampler -> delete_model.
- Shows explicitly what belongs in each of the two sections, model_input and loss_fn_inputs.
- Uses three 1x1 PNG placeholders so the script is self-contained and does not depend on any extra dataset.
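Placeholder images like the demo's 1x1 PNGs can be produced with nothing but the standard library. A minimal sketch (the `one_pixel_png` and `png_chunk` names are illustrative, not the script's actual helpers):

```python
import base64
import struct
import zlib


def png_chunk(tag: bytes, payload: bytes) -> bytes:
    # Each PNG chunk is: 4-byte big-endian length, tag, payload, CRC32 of tag+payload.
    return (struct.pack(">I", len(payload)) + tag + payload
            + struct.pack(">I", zlib.crc32(tag + payload) & 0xFFFFFFFF))


def one_pixel_png(rgb=(0, 0, 0)) -> bytes:
    # 1x1 8-bit RGB image: signature, IHDR, one scanline (filter byte + pixel), IEND.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    idat = zlib.compress(b"\x00" + bytes(rgb))
    return (b"\x89PNG\r\n\x1a\n" + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", idat) + png_chunk(b"IEND", b""))


# The wire format carries images base64-encoded, so a constant like this
# is all the script needs to stay self-contained.
ONE_PIXEL_PNG_B64 = base64.b64encode(one_pixel_png()).decode("ascii")
```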
Request shape
model_input.chunks
1. image: base_0_rgb
2. image: left_wrist_0_rgb
3. image: right_wrist_0_rgb
4. encoded_text: prefix_tokens
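The four chunks above can be sketched as plain dicts, mirroring the payload code later on this page (the two helper names are illustrative; the base64 string and token list are placeholders):

```python
import base64


def image_chunk(png_bytes: bytes, expected_tokens: int = 256) -> dict:
    # Each camera frame travels as a base64-encoded PNG image chunk.
    return {
        "data": base64.b64encode(png_bytes).decode("ascii"),
        "format": "png",
        "expected_tokens": expected_tokens,
        "type": "image",
    }


def text_chunk(prefix_tokens) -> dict:
    # The prompt is sent pre-tokenized as an encoded_text chunk.
    return {"tokens": list(prefix_tokens), "type": "encoded_text"}


# Three camera frames followed by the tokenized prefix.
chunks = [image_chunk(b"...") for _ in range(3)] + [text_chunk([1, 2, 3])]
```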
loss_fn_inputs
- state
- target_tokens
- weights
- token_ar_mask
- optional: logprobs + advantages

How to run
pip install httpx python-dotenv
export MINT_API_KEY=sk-...
python demos/embodied/openpi_vla_http.py

Pick the MinT domain for your region:
- Mainland China: https://mint-cn.macaron.xin/
- International: https://mint.macaron.xin/
Parameters (environment variables)
- MINT_API_KEY / TINKER_API_KEY: auth token
- MINT_BASE_URL / TINKER_BASE_URL: service endpoint
- MINT_OPENPI_HTTP_BASE_MODEL: defaults to openpi/pi0-fast-libero-low-mem-finetune
- MINT_OPENPI_HTTP_LORA_RANK: defaults to 16
- MINT_OPENPI_HTTP_LR: defaults to 0.003
- MINT_OPENPI_HTTP_SAMPLER_PATH: defaults to mint-openpi-vla-http-example
- MINT_OPENPI_HTTP_CLIENT_TIMEOUT_SECONDS: defaults to 120
- MINT_OPENPI_HTTP_FUTURE_TIMEOUT_SECONDS: defaults to 1200
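Reading these with their fallback chains might look like the sketch below. The `env` helper is hypothetical, not part of the script; it only illustrates the primary/fallback/default resolution order described above:

```python
import os


def env(name, fallback_name=None, default=None):
    # Hypothetical helper: primary variable, then fallback variable, then default.
    value = os.environ.get(name)
    if value is None and fallback_name:
        value = os.environ.get(fallback_name)
    return value if value is not None else default


API_KEY = env("MINT_API_KEY", "TINKER_API_KEY")
BASE_URL = env("MINT_BASE_URL", "TINKER_BASE_URL", "https://mint.macaron.xin/")
BASE_MODEL = env("MINT_OPENPI_HTTP_BASE_MODEL",
                 default="openpi/pi0-fast-libero-low-mem-finetune")
LORA_RANK = int(env("MINT_OPENPI_HTTP_LORA_RANK", default="16"))
LR = float(env("MINT_OPENPI_HTTP_LR", default="0.003"))
```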
Key payload code
import base64  # this excerpt assumes base64 is imported at the top of the script


def build_openpi_fast_datum_payload(
    *,
    prefix_tokens,
    image_bytes_by_camera,
    state,
    target_tokens,
    weights,
    token_ar_mask,
    logprobs=None,
    advantages=None,
):
    # Camera order must match the chunk order documented above.
    cameras = ("base_0_rgb", "left_wrist_0_rgb", "right_wrist_0_rgb")
    payload = {
        "model_input": {
            "chunks": [
                *(
                    {
                        "data": base64.b64encode(image_bytes_by_camera[camera]).decode("ascii"),
                        "format": "png",
                        "expected_tokens": 256,
                        "type": "image",
                    }
                    for camera in cameras
                ),
                {"tokens": list(prefix_tokens), "type": "encoded_text"},
            ]
        },
        "loss_fn_inputs": {
            "state": {"data": list(state), "shape": [len(state)], "dtype": "float32"},
            "target_tokens": {"data": list(target_tokens), "shape": [len(target_tokens)], "dtype": "int64"},
            "weights": {"data": list(weights), "shape": [len(weights)], "dtype": "float32"},
            "token_ar_mask": {"data": list(token_ar_mask), "shape": [len(token_ar_mask)], "dtype": "int32"},
        },
    }
    # Optional RL tensors; float32 is assumed here for both dtypes.
    if logprobs is not None:
        payload["loss_fn_inputs"]["logprobs"] = {"data": list(logprobs), "shape": [len(logprobs)], "dtype": "float32"}
    if advantages is not None:
        payload["loss_fn_inputs"]["advantages"] = {"data": list(advantages), "shape": [len(advantages)], "dtype": "float32"}
    return payload

Future polling code
import time  # this excerpt assumes time is imported at the top of the script


def poll_future(client, *, request_id, timeout_seconds=None, sleep=time.sleep):
    # Fall back to the documented default when no timeout is passed.
    if timeout_seconds is None:
        timeout_seconds = 1200
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = client.post("/api/v1/retrieve_future", json={"request_id": request_id})
        if response.status_code == 408:
            # 408 means the future is not ready yet: back off and retry.
            payload = response.json()
            delay_seconds = _poll_delay_seconds(payload, response.headers)
            if time.monotonic() + delay_seconds > deadline:
                raise TimeoutError(...)
            sleep(delay_seconds)
            continue
        response.raise_for_status()
        return response.json()

Main flow
def run_example():
    # `...` marks request-builder arguments elided in this excerpt.
    session = client.post("/api/v1/create_session", json=build_create_session_request())
    create_future = client.post("/api/v1/create_model", json=build_create_model_request(...))
    create_result = poll_future(client, request_id=create_future.json()["request_id"])
    step_future = client.post("/api/v1/train_step", json=build_train_step_request(...))
    step_result = poll_future(client, request_id=step_future.json()["request_id"])
    sampler_future = client.post(
        "/api/v1/save_weights_for_sampler",
        json=build_save_weights_for_sampler_request(...),
    )
    sampler_result = poll_future(client, request_id=sampler_future.json()["request_id"])

Expected output shape
At the end, the script prints a Python dict containing these keys:
session_id
model_id
model_info
train_step
sampler
models
delete_model

Important notes
- The recommended MinT SDK path now lives in mint.mint/mintx; see OpenPI VLA SDK.
- This example deliberately stays at raw HTTP: it documents the current real request shape, not the SDK-helper layer.
- The three 1x1 PNGs are placeholders only. When wiring up a real robot or simulator, replace them with real camera frames and real rollout tensors.
- The save_weights_for_sampler step shows that weights can already be exported after training, so a sampling or evaluation pipeline can be attached downstream.
Next steps
- For the main MinT SDK usage, see OpenPI VLA SDK.
- Replace ONE_PIXEL_PNG with real base_0_rgb / wrist-camera images.
- Replace the toy state and target_tokens with real rollout data.
- Wire the saved sampler path into an evaluation or rollout client.