# Pi0.5 Fine-tuned on Franka Insert-Marker Task

A Pi0.5 model fine-tuned on a Franka Panda marker insertion task using the OpenPI framework.

## Model Details

| Property | Value |
|---|---|
| Base model | Pi0.5 DROID (pretrained) |
| Fine-tune config | `pi05_droid_finetune` |
| Training steps | 5,000 |
| Batch size | 32 |
| Action space | 8D joint velocity (7 joint velocities + 1 gripper position) |
| Action horizon | 16 (chunk length) |
| Precision | bfloat16 |
| Checkpoint size | ~6.7 GB |

## Training Data

Fine-tuned on `ankile/franka-insert-marker-v2-openpi`, a dataset of teleoperated demonstrations of a Franka Panda inserting a marker, collected via the DROID platform.

## Observations

The model expects two camera views and proprioceptive state:

| Key | Description |
|---|---|
| `observation/exterior_image_1_left` | Base camera (RGB, uint8, HWC) |
| `observation/wrist_image_left` | Wrist camera (RGB, uint8, HWC) |
| `observation/joint_position` | 7D joint positions (float32) |
| `observation/gripper_position` | 1D gripper position (float32) |
| `prompt` | Language task description |

OpenPI handles image resizing (to 224×224), normalization, and tokenization internally.

## Usage

### Download and run inference with OpenPI

```bash
# Clone OpenPI
git clone https://github.com/Physical-Intelligence/openpi.git
cd openpi

# Download the checkpoint from the Hugging Face Hub
huggingface-cli download ankile/openpi-pi05-franka-insert-marker-v2-ft \
  --local-dir checkpoints/pi05_droid_finetune/pi05-insert-marker-v2-ft/4999

# Run the inference server
uv run python scripts/serve_policy.py pi05_droid_finetune \
  --checkpoint-dir checkpoints/pi05_droid_finetune/pi05-insert-marker-v2-ft/4999
```
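Once the server is running, a robot-side process can query it remotely. A minimal sketch, assuming the `openpi-client` package that ships in the OpenPI repo (`packages/openpi-client`); the host, port, and the dummy sensor values are placeholders:

```python
# Sketch of a remote client for the policy server started above.
# The `openpi_client` import and its WebsocketClientPolicy API are
# assumptions based on the OpenPI repo; host/port are placeholders.
import numpy as np

def make_observation(base_image, wrist_image, joint_positions, gripper_position,
                     prompt="insert the marker"):
    """Pack raw sensor data into the observation dict described above."""
    return {
        "observation/exterior_image_1_left": base_image,
        "observation/wrist_image_left": wrist_image,
        "observation/joint_position": joint_positions,
        "observation/gripper_position": gripper_position,
        "prompt": prompt,
    }

# Dummy frames/state stand in for real camera and proprioception readings.
obs = make_observation(
    base_image=np.zeros((480, 640, 3), dtype=np.uint8),
    wrist_image=np.zeros((480, 640, 3), dtype=np.uint8),
    joint_positions=np.zeros(7, dtype=np.float32),
    gripper_position=np.zeros(1, dtype=np.float32),
)

# from openpi_client import websocket_client_policy
# client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)
# actions = client.infer(obs)["actions"]  # expected shape [16, 8]
```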

### In-process inference (Python)

```python
from openpi.training import config as openpi_config
from openpi.policies import policy_config as openpi_policy_config

train_config = openpi_config.get_config("pi05_droid_finetune")
policy = openpi_policy_config.create_trained_policy(
    train_config, "<path_to_downloaded_checkpoint>"
)

obs = {
    "observation/exterior_image_1_left": base_image,   # RGB uint8 HWC
    "observation/wrist_image_left": wrist_image,       # RGB uint8 HWC
    "observation/joint_position": joint_positions,     # float32 [7]
    "observation/gripper_position": gripper_position,  # float32 [1]
    "prompt": "insert the marker",
}
result = policy.infer(obs)
actions = result["actions"]  # shape [16, 8]
```
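Each call returns a chunk of 16 actions, so a control loop typically executes part of the chunk and then replans. A minimal receding-horizon sketch; `get_obs`, `send_action`, and the choice to replan after half the chunk are hypothetical, not something this checkpoint mandates:

```python
# Receding-horizon execution of the [16, 8] action chunks.
# `infer`, `get_obs`, and `send_action` are caller-supplied callbacks;
# EXECUTE_STEPS = 8 (replan halfway through the chunk) is an assumption.
import numpy as np

ACTION_HORIZON = 16
EXECUTE_STEPS = 8

def run_episode(infer, get_obs, send_action, num_steps=32):
    """Query the policy, execute part of each chunk, then replan."""
    executed = 0
    while executed < num_steps:
        chunk = np.asarray(infer(get_obs())["actions"])  # shape [16, 8]
        for action in chunk[:EXECUTE_STEPS]:
            send_action(action)  # 7 joint velocities + 1 gripper position
            executed += 1
            if executed >= num_steps:
                break
    return executed
```

With the in-process policy above, `infer` would be `policy.infer` and `send_action` would forward the 8D command to the robot controller at its control rate.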

## Training

Fine-tuned from the pretrained Pi0.5 DROID checkpoint (`gs://openpi-assets/checkpoints/pi05_droid/params`) using OpenPI's training pipeline:

```bash
uv run python scripts/train.py pi05_droid_finetune \
    --data.repo-id=ankile/franka-insert-marker-v2-openpi \
    --project-name=franka-insert-marker \
    --exp-name=pi05-insert-marker-v2-ft \
    --num-train-steps=5000 \
    --batch-size=32 \
    --save-interval=5000
```

Hardware: 1× NVIDIA H200 (141 GB), with `XLA_PYTHON_CLIENT_MEM_FRACTION=0.9`.

## Checkpoint Format

This checkpoint uses Orbax format (not safetensors/PyTorch). All parameters are stored in bfloat16.

```
├── _CHECKPOINT_METADATA
├── assets/
│   └── droid/
│       └── norm_stats.json
└── params/
    └── (orbax checkpoint files)
```

## Citation

If you use this model, please cite the Pi0 paper:

```bibtex
@article{black2024pi0,
  title={$\pi_0$: A Vision-Language-Action Flow Model for General Robot Control},
  author={Black, Kevin and Brown, Noah and Driess, Danny and Esmail, Adnan and Equi, Michael and Finn, Chelsea and Fusai, Niccolo and Groom, Lachy and Hausman, Karol and Ichter, Brian and others},
  journal={arXiv preprint arXiv:2410.24164},
  year={2024}
}
```

## License

Apache 2.0
