OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value • Paper • 2512.14051 • Published 4 days ago • 35
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning • Paper • 2510.04081 • Published Oct 5 • 23
GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models • Paper • 2511.11134 • Published Nov 14 • 31