VLM with textual-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
Derek Zhe Hu
zhehuderek
AI & ML interests
NLP, Multimodality
Recent Activity
updated a model 24 days ago: zhehuderek/wall-oss-fast-grab-roller
published a model 24 days ago: zhehuderek/wall-oss-fast-grab-roller
liked a dataset about 1 month ago: nvidia/PBench
YesBut
A collection of benchmarks and papers on visual humor understanding and comparative reasoning.
- zhehuderek/YESBUT_Benchmark
  Viewer • Updated • 348 • 16 • 1
- zhehuderek/YESBUT_Benchmark_V2
  Viewer • Updated • 1.26k • 27 • 1
- Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
  Paper • 2405.19088 • Published
- When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
  Paper • 2503.23137 • Published
Praxis-VLM
VLM with textual-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
Models (7)
zhehuderek/wall-oss-fast-grab-roller
4B • Updated • 18
zhehuderek/qwen2_5_vl_7b_GEOQA_8K_step90_hf
Image-to-Text • 8B • Updated • 2
zhehuderek/praxis_vlm_7b_decisionmaking
Image-to-Text • 8B • Updated • 2
zhehuderek/praxis_vlm_3b_decisionmaking
Image-to-Text • 4B • Updated • 2
zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf
Image-to-Text • 4B • Updated • 1
zhehuderek/llama-2-7b-chinese
Text Generation • 7B • Updated • 3
zhehuderek/llama-3.1-8b-chinese-sft
Text Generation • 8B • Updated • 16
Datasets (10)
zhehuderek/VIVA_Plus_Benchmark
Viewer • Updated • 6.37k • 226
zhehuderek/OpenThoughts3-1.2M-processed
Viewer • Updated • 39.6k • 8
zhehuderek/humor_understanding_combined
Viewer • Updated • 4.89k • 16 • 1
zhehuderek/humor_understanding_nyt
Viewer • Updated • 2.69k • 11
zhehuderek/comparative_reasoning_mllm_compbench
Viewer • Updated • 21.8k • 9
zhehuderek/humor_understanding_deepeval
Viewer • Updated • 2.96k • 11
zhehuderek/textual_decisionmaking_data
Viewer • Updated • 11k • 10 • 1
zhehuderek/YESBUT_Benchmark_V2
Viewer • Updated • 1.26k • 27 • 1
zhehuderek/YESBUT_Benchmark
Viewer • Updated • 348 • 16 • 1
zhehuderek/VIVA_Benchmark_EMNLP24
Viewer • Updated • 1.24k • 21