Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2505.24760

Robust Multimodal Large Language Models Against Modality Conflict

Paper • 2507.07151 • Published Jul 9 • 5
One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31
Test-Time Scaling with Reflective Generative Model

Paper • 2507.01951 • Published Jul 2 • 106
KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11 • 40

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 425
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

Paper • 2409.12576 • Published Sep 19, 2024 • 16
Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 172

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120
Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22 • 63
Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27 • 26
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30 • 73

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

Paper • 2402.07754 • Published Feb 12, 2024
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models

Paper • 2505.10446 • Published May 15
A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 92
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning

Paper • 2505.16782 • Published May 22 • 1

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30 • 73

URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics

Paper • 2501.04686 • Published Jan 8 • 53
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 41
LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 130
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding

Paper • 2502.19400 • Published Feb 26 • 48

Robust Multimodal Large Language Models Against Modality Conflict

Paper • 2507.07151 • Published Jul 9 • 5
One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31
Test-Time Scaling with Reflective Generative Model

Paper • 2507.01951 • Published Jul 2 • 106
KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11 • 40

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

Paper • 2402.07754 • Published Feb 12, 2024
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models

Paper • 2505.10446 • Published May 15
A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 92
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning

Paper • 2505.16782 • Published May 22 • 1

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 425
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

Paper • 2409.12576 • Published Sep 19, 2024 • 16
Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 172

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30 • 73

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120
Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22 • 63
Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27 • 26
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30 • 73

URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics

Paper • 2501.04686 • Published Jan 8 • 53
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 41
LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 130
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding

Paper • 2502.19400 • Published Feb 26 • 48

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs