VLM with textual-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
Derek Zhe Hu
zhehuderek
AI & ML interests
NLP, Multimodality
Recent Activity
liked a dataset (1 day ago): builddotai/Egocentric-10K
liked a dataset (about 1 month ago): nvidia/PBench
liked a dataset (about 1 month ago): luckychao/EMMA
Organizations: None yet