mlfoundations-cua-dev/grpo-7b-stage-3-on-103k-filtered-data-temp-2-2-zero-correct-to-0.2-no-pixmo-uground-seeclick • 8B • Updated • 7
mlfoundations-cua-dev/grpo-7b-stage-3-on-103k-filtered-data-temp-2-1-0p2-show-ui-ui-vision-jedi-gta-dense-reward • 8B • Updated • 6
mlfoundations-cua-dev/grpo-7b-stage-3-on-103k-filtered-data-temp-1-8-0p2-show-ui-ui-vision-jedi-gta-dense-reward • 8B • Updated • 6
mlfoundations-cua-dev/grpo-7b-stage-3-on-103k-filtered-data-temp-2-0-0p2-show-ui-ui-vision-jedi-gta-dense-reward • 8B • Updated • 4
mlfoundations-cua-dev/grpo-7b-stage-2-on-103k-filtered-data-temp-1-7-zero-correct-to-0.2 • 8B • Updated • 4
mlfoundations-cua-dev/grpo-7b-stage-3-on-103k-filtered-data-temp-2-1-0p2-zero-only-show-ui-ui-vision-jedi-gta-n-16 • 8B • Updated • 5
mlfoundations-cua-dev/grpo-7b-stage-1-on-103k-dense-reward-step-60 • 8B • Updated • 6
mlfoundations-cua-dev/grpo-7b-stage-1-on-103k-dense-reward-step-40 • 8B • Updated • 4
mlfoundations-cua-dev/grpo-7b-stage-1-on-103k-dense-reward-step-20 • 8B • Updated • 7
mlfoundations-cua-dev/grpo-7b-stage-2-on-103k-filtered-data-temp-1-7-zero-correct-to-0.3 • 8B • Updated • 2
mlfoundations-cua-dev/rpo-7b-stage-2-on-103k-filtered-data-temp-1-4-zero-correct-to-0.2 • 8B • Updated • 5
mlfoundations-cua-dev/grpo-7b-stage-1-on-103k-filtered-data • 8B • Updated • 8
mlfoundations-cua-dev/grpo_colstart_3_3k_on_63k_dynamic_batching-bs_128_8nodes-dense_reward_step_220 • 8B • Updated • 8
mlfoundations-cua-dev/grpo_colstart_3_3k_on_63k_dynamic_batching-bs_128_8nodes-dense_reward_step_200 • 8B • Updated • 9
mlfoundations-cua-dev/grpo_colstart_3_3k_on_63k_dynamic_batching-bs_128_8nodes-dense_reward_step_160 • 8B • Updated • 10
mlfoundations-cua-dev/grpo_colstart_3_3k_on_63k_dynamic_batching-bs_128_8nodes-dense_reward_step_120 • 8B • Updated • 10
mlfoundations-cua-dev/grpo_colstart_3_3k_on_63k_dynamic_batching-bs_128_8nodes-dense_reward_step_80 • 8B • Updated • 8
mlfoundations-cua-dev/grpo_colstart_3_3k_on_63k_dynamic_batching-bs_128_8nodes-dense_reward_step_40 • 8B • Updated • 7
mlfoundations-cua-dev/grpo_colstart_3_3k_on_63k_dynamic_batching-bs_128_8nodes-dense_reward • 8B • Updated • 7
mlfoundations-cua-dev/qwen2_5vl_7b_model_soup_4x_uniform_add_103k_sft • 8B • Updated • 8
mlfoundations-cua-dev/qwen2_5vl_7b_easyr1_103k_4MP_jedi_ui_vision_gta1_data_lr_1_0e-06_z3_4nodes • Image-to-Text • 849k • Updated • 10
mlfoundations-cua-dev/grpo-coldstart-63k-on-63k-max-prompt-5200-dynamic-batching-bs_128_8nodes • Updated
mlfoundations-cua-dev/grpo-coldstart-63k-on-63k-max-prompt-5200-dynamic-batching-bs_128_8nodes-ui-venus-params • Updated
mlfoundations-cua-dev/grpo-coldstart-10k-on-63k-max-prompt-5200-dynamic-batching-bs_128_8nodes • Updated
mlfoundations-cua-dev/grpo-coldstart-1k-on-63k-max-prompt-5200-dynamic-batching-bs_128_8nodes • Updated
mlfoundations-cua-dev/grpo-7b-stage-2-on-103k-filtered-data-temp-1-4-zero-correct-to-0.2 • Updated
mlfoundations-cua-dev/qwen2_5vl_7b_model_soup_4x_uniform_add_best_rl • 8B • Updated • 6
mlfoundations-cua-dev/qwen2_5vl_7b_easyr1_38k_lr_1_0e-06_bs16_4nodes • Image-to-Text • 849k • Updated • 3
mlfoundations-cua-dev/qwen2_5vl_7b_easyr1_38k_lr_1_0e-06_bs16_4nodes_2epochs • Image-to-Text • 849k • Updated • 4
mlfoundations-cua-dev/coldstart_10k_from_44k_qwen2_5vl_7b_ui_vision_grounding_4MP_lr_1_0e-06_bs16_4nodes • Image-to-Text • 849k • Updated • 7