- Robust Multimodal Large Language Models Against Modality Conflict
  Paper • 2507.07151 • Published • 5
- One Token to Fool LLM-as-a-Judge
  Paper • 2507.08794 • Published • 31
- Test-Time Scaling with Reflective Generative Model
  Paper • 2507.01951 • Published • 106
- KV Cache Steering for Inducing Reasoning in Small Language Models
  Paper • 2507.08799 • Published • 40
Collections
Collections including paper arxiv:2505.24760
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 425
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 140
- StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
  Paper • 2409.12576 • Published • 16
- Transformer Explainer: Interactive Learning of Text-Generative Models
  Paper • 2408.04619 • Published • 172

- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 120
- Describe Anything: Detailed Localized Image and Video Captioning
  Paper • 2504.16072 • Published • 63
- Reinforcing General Reasoning without Verifiers
  Paper • 2505.21493 • Published • 26
- REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
  Paper • 2505.24760 • Published • 73

- Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
  Paper • 2402.07754 • Published
- Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
  Paper • 2505.10446 • Published
- A Survey on Latent Reasoning
  Paper • 2507.06203 • Published • 92
- Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning
  Paper • 2505.16782 • Published • 1

- URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
  Paper • 2501.04686 • Published • 53
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
  Paper • 2501.09686 • Published • 41
- LLaVA-o1: Let Vision Language Models Reason Step-by-Step
  Paper • 2411.10440 • Published • 130
- TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding
  Paper • 2502.19400 • Published • 48