rubricreward/arena-human-preference-tgt_prompt_tgt_thinking-filtered_correct • Updated Sep 2 • 61.1k • 11
rubricreward/arena-human-preference-tgt_prompt_en_thinking-filtered_correct • Updated Sep 2 • 60.7k • 9
rubricreward/arena-human-preference-en_prompt_en_thinking-filtered_correct • Updated Sep 2 • 60.3k • 12
rubricreward/PolyGuardMix-filtered-tgt_prompt_tgt_thinking-filtered_correct • Updated Sep 1 • 624k • 10
rubricreward/PolyGuardMix-filtered-tgt_prompt_en_thinking-filtered_correct • Updated Sep 1 • 631k • 10
rubricreward/PolyGuardMix-filtered-en_prompt_en_thinking-filtered_correct • Updated Sep 1 • 638k • 11
rubricreward/HumanEval-XL-Python-tgt_prompt_tgt_thinking-filtered_correct • Updated Aug 31 • 3.1k • 11
rubricreward/MATH-500-Multilingual-tgt_prompt_tgt_thinking-filtered_correct • Updated Aug 31 • 4.57k • 11
rubricreward/HumanEval-XL-Python-tgt_prompt_en_thinking-filtered_correct • Updated Aug 31 • 3.2k • 10
rubricreward/MATH-500-Multilingual-tgt_prompt_en_thinking-filtered_correct • Updated Aug 31 • 4.83k • 11
rubricreward/HumanEval-XL-Python-en_prompt_tgt_thinking-filtered_correct • Updated Aug 31 • 3.09k • 11
rubricreward/MATH-500-Multilingual-en_prompt_tgt_thinking-filtered_correct • Updated Aug 31 • 4.59k • 11
rubricreward/arena-human-preference-en_prompt_tgt_thinking-filtered_correct • Updated Aug 31 • 51.4k • 8
rubricreward/HumanEval-XL-Python-en_prompt_en_thinking-filtered_correct • Updated Aug 31 • 3.25k • 11
rubricreward/MATH-500-Multilingual-en_prompt_en_thinking-filtered_correct • Updated Aug 31 • 4.83k • 11