Evo-1 R1Pro Stage 1 Checkpoints

Stage 1 training checkpoints for the Evo-1 model on the R1Pro dataset (2025 Challenge Task 0006).

Training Details

Dataset: R1Pro 2025-challenge-demos-task0006

  • 200 episodes, 1.5M frames
  • 3 RGB camera views (head, left_wrist, right_wrist)
  • 24-dim state features selected from the 256-dim observation (see State Indices below)

Model Configuration:

  • Base model: OpenGVLab/InternVL3-1B
  • Action head: Flow Matching (see the loss sketch after this list)
  • Image size: 448x448
  • Horizon: 50
  • Action dim: 24
  • State dim: 24
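
The action head is trained with a flow-matching objective: it learns to predict the velocity that transports Gaussian noise into the ground-truth 50×24 action chunk. Below is a minimal sketch of that loss, assuming a generic `action_head(x_t, t, features)` velocity predictor; the signature and variable names are placeholders, not the actual Evo-1 interfaces.

```python
import torch
import torch.nn.functional as F

def flow_matching_loss(action_head, vlm_features, actions):
    """Minimal flow-matching loss sketch (hypothetical interfaces).

    actions:      (B, horizon=50, action_dim=24) ground-truth action chunk
    vlm_features: conditioning features from the frozen VLM
    """
    noise = torch.randn_like(actions)                               # x_0 ~ N(0, I)
    t = torch.rand(actions.shape[0], 1, 1, device=actions.device)   # t ~ U(0, 1)
    x_t = (1 - t) * noise + t * actions                             # linear interpolation path
    target_velocity = actions - noise                               # dx_t/dt along that path
    pred_velocity = action_head(x_t, t, vlm_features)               # predicted velocity field
    return F.mse_loss(pred_velocity, target_velocity)
```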

Stage 1 Training (Action Head Only):

  • Max steps: 5000
  • Batch size: 16
  • Learning rate: 1e-5
  • Dropout: 0.2
  • Weight decay: 1e-3
  • Warmup steps: 1000
  • Grad clip norm: 1.0
  • Trainable parameters: 122.03M (action head + integration module; see the setup sketch after this list)
  • Frozen parameters: 648.69M (VLM)
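
In Stage 1, only the action head and integration module receive gradients while the InternVL3-1B backbone stays frozen. The sketch below shows an equivalent plain-PyTorch setup with the hyperparameters above; in the actual run DeepSpeed ZeRO Stage 2 wraps the optimizer, and attribute names such as `model.vlm` are assumptions, not the real Evo-1 module names.

```python
import torch

def build_stage1_optimizer(model):
    """Freeze the VLM backbone and set up the Stage 1 optimizer/scheduler (sketch)."""
    for p in model.vlm.parameters():        # ~648.69M frozen VLM parameters (hypothetical attribute)
        p.requires_grad = False

    trainable = [p for p in model.parameters() if p.requires_grad]  # ~122.03M trainable parameters
    optimizer = torch.optim.AdamW(trainable, lr=1e-5, weight_decay=1e-3)

    # Linear warmup over the first 1,000 of 5,000 steps, constant afterwards.
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / 1000)
    )
    return trainable, optimizer, scheduler

def train_step(model, batch, trainable, optimizer, scheduler):
    loss = model(**batch)                                     # hypothetical forward pass returning the loss
    loss.backward()
    torch.nn.utils.clip_grad_norm_(trainable, max_norm=1.0)   # grad clip norm 1.0
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
    return loss.item()
```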

Training Speed:

  • ~3.43 seconds/step with 4 DataLoader workers and GPU-accelerated TorchCodec decoding
  • Total training time: ~4.8 hours (5,000 steps × ~3.43 s/step ≈ 17,150 s)

Technology Stack:

  • PyTorch 2.8.0 (CUDA 12.9)
  • TorchCodec 0.8.1 (GPU-accelerated video decoding)
  • DeepSpeed ZeRO Stage 2
  • 4 DataLoader workers with spawn multiprocessing (see the data-loading sketch after this list)
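
The data pipeline decodes episode video on the GPU inside worker processes, which is why the spawn start method is used. A minimal sketch, assuming the TorchCodec `VideoDecoder` API; the `R1ProFrames` dataset class and the video path are illustrative only.

```python
from torch.utils.data import DataLoader, Dataset
from torchcodec.decoders import VideoDecoder  # TorchCodec 0.8.x

class R1ProFrames(Dataset):
    """Hypothetical per-frame dataset that decodes RGB frames on the GPU."""

    def __init__(self, video_path, num_frames):
        self.video_path = video_path
        self.num_frames = num_frames

    def __len__(self):
        return self.num_frames

    def __getitem__(self, idx):
        # device="cuda" offloads decoding to the GPU; a real pipeline would
        # cache one decoder per worker instead of rebuilding it per sample.
        decoder = VideoDecoder(self.video_path, device="cuda")
        frame = decoder[idx]            # uint8 CHW tensor
        return frame.float() / 255.0

loader = DataLoader(
    R1ProFrames("episode_000000/head_rgb.mp4", num_frames=1000),  # illustrative path
    batch_size=16,
    num_workers=4,                     # 4 DataLoader workers
    multiprocessing_context="spawn",   # required so workers can safely use CUDA
)
```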

Checkpoints

Checkpoints were saved every 500 steps:

  • step_500 through step_5000 (regular checkpoints)
  • step_best - Best-performing checkpoint
  • step_final - Final checkpoint

Each checkpoint contains (see the loading sketch after this list):

  • mp_rank_00_model_states.pt - Model weights (~2.7GB)
  • bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt - Optimizer states (~1.4GB)
  • config.json - Model configuration
  • norm_stats.json - Normalization statistics
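
To inspect a checkpoint outside of DeepSpeed, the weights can be read straight from the model-states file. A minimal sketch, assuming the usual DeepSpeed layout in which the model state dict sits under a "module" key; the checkpoint directory name is one of the folders listed above.

```python
import json
import torch

ckpt_dir = "step_best"  # or step_500 ... step_5000, step_final

# Under ZeRO Stage 2 the model weights are not sharded, so this file holds the full state dict.
states = torch.load(
    f"{ckpt_dir}/mp_rank_00_model_states.pt",
    map_location="cpu",
    weights_only=False,          # DeepSpeed states include non-tensor metadata
)
state_dict = states.get("module", states)   # weights typically live under "module"

with open(f"{ckpt_dir}/config.json") as f:
    config = json.load(f)
with open(f"{ckpt_dir}/norm_stats.json") as f:
    norm_stats = json.load(f)    # normalization statistics for states/actions

num_params = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))
print(f"{len(state_dict)} tensors, {num_params / 1e6:.2f}M parameters")
```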

Usage

For Stage 2 training (full VLM finetuning), use:

--resume --resume_pretrain --resume_path step_best

State Indices

Selected 24 state features from the 256-dim observation (see the indexing sketch after this list):

  • Left arm joints: 158-164 (7 dims)
  • Right arm joints: 197-203 (7 dims)
  • Left gripper: 193-194 (2 dims)
  • Right gripper: 232-233 (2 dims)
  • Trunk position: 236-239 (4 dims)
  • Trunk velocity: 240-241 (2 dims)
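
A minimal indexing sketch for this selection, assuming the ranges above are inclusive and 0-based (as the dimension counts suggest):

```python
import torch

# 24 selected state dimensions within the 256-dim observation (inclusive ranges).
STATE_INDICES = (
    list(range(158, 165))    # left arm joints  (7)
    + list(range(197, 204))  # right arm joints (7)
    + list(range(193, 195))  # left gripper     (2)
    + list(range(232, 234))  # right gripper    (2)
    + list(range(236, 240))  # trunk position   (4)
    + list(range(240, 242))  # trunk velocity   (2)
)
assert len(STATE_INDICES) == 24

def select_state(observation: torch.Tensor) -> torch.Tensor:
    """Reduce a (..., 256) observation to the (..., 24) state vector."""
    return observation[..., STATE_INDICES]
```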

Training Loss

Training loss decreased from 0.6719 at the start of Stage 1 to approximately 0.53 by the final step.
