Praxis-VLM - a zhehuderek Collection

Praxis-VLM

updated Sep 23

A VLM trained with text-driven GRPO for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)

zhehuderek/textual_decisionmaking_data

Viewer • Updated Apr 9 • 11k • 9 • 1

Note: This is the synthetic textual data we used for model training.

zhehuderek/praxis_vlm_7b_decisionmaking

Image-to-Text • 8B • Updated Jun 3 • 2

zhehuderek/praxis_vlm_3b_decisionmaking

Image-to-Text • 4B • Updated Jun 3 • 2

zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf

Image-to-Text • 4B • Updated Apr 9 • 1

Note: This is the model checkpoint after cold-start math training on the GEOQA-8K dataset.

zhehuderek/qwen2_5_vl_7b_GEOQA_8K_step90_hf

Image-to-Text • 8B • Updated Sep 21 • 3

Note: This is the model checkpoint after cold-start math training on the GEOQA-8K dataset.