BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9 • 35
Article BigCodeArena: Judging code generations end to end with code executions By bigcode • Oct 7 • 17
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published Jun 17 • 49
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs Paper • 2502.19411 • Published Feb 26 • 2
LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models Paper • 2404.05221 • Published Apr 8, 2024 • 1
RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems Paper • 2306.03091 • Published Jun 5, 2023 • 1
Rethinking Tabular Data Understanding with Large Language Models Paper • 2312.16702 • Published Dec 27, 2023 • 5