Collections including paper arxiv:2506.01939

- Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
  Paper • 2501.18585 • Published • 62
- RWKV-7 "Goose" with Expressive Dynamic State Evolution
  Paper • 2503.14456 • Published • 153
- DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
  Paper • 2503.15265 • Published • 46
- Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
  Paper • 2503.15558 • Published • 50

- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 123
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 4

- Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
  Paper • 2407.20798 • Published • 24
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 38
- REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
  Paper • 2501.03262 • Published • 103
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
  Paper • 2502.18449 • Published • 75

- Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
  Paper • 2501.18585 • Published • 62
- LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!
  Paper • 2502.07374 • Published • 40
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
  Paper • 2502.06703 • Published • 153
- S*: Test Time Scaling for Code Generation
  Paper • 2502.14382 • Published • 63

- Phi-4 Technical Report
  Paper • 2412.08905 • Published • 122
- Evaluating and Aligning CodeLLMs on Human Preference
  Paper • 2412.05210 • Published • 50
- Evaluating Language Models as Synthetic Data Generators
  Paper • 2412.03679 • Published • 48
- Yi-Lightning Technical Report
  Paper • 2412.01253 • Published • 28

- Latent Reasoning in LLMs as a Vocabulary-Space Superposition
  Paper • 2510.15522 • Published • 1
- Language Models are Injective and Hence Invertible
  Paper • 2510.15511 • Published • 66
- Eliciting Secret Knowledge from Language Models
  Paper • 2510.01070 • Published • 4
- Interpreting Language Models Through Concept Descriptions: A Survey
  Paper • 2510.01048 • Published • 2