-
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
Paper • 2412.14475 • Published • 55 -
How to Synthesize Text Data without Model Collapse?
Paper • 2412.14689 • Published • 52 -
Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published • 46 -
WavePulse: Real-time Content Analytics of Radio Livestreams
Paper • 2412.17998 • Published • 11
Collections
Discover the best community collections!
Collections including paper arxiv:2509.00375
-
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Paper • 2509.06283 • Published • 17 -
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
Text Generation • 31B • Updated • 11.6k • 764 -
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 71 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68
-
Robix: A Unified Model for Robot Interaction, Reasoning and Planning
Paper • 2509.01106 • Published • 48 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68 -
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
Paper • 2509.00428 • Published • 17 -
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
Paper • 2509.03405 • Published • 23
-
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68 -
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
Paper • 2509.03403 • Published • 21 -
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
Paper • 2509.03405 • Published • 23 -
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
Paper • 2509.00930 • Published • 4
-
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Paper • 2508.20751 • Published • 89 -
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling
Paper • 2508.17445 • Published • 80 -
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper • 2508.19247 • Published • 41 -
VibeVoice Technical Report
Paper • 2508.19205 • Published • 123
-
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
Paper • 2412.14475 • Published • 55 -
How to Synthesize Text Data without Model Collapse?
Paper • 2412.14689 • Published • 52 -
Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published • 46 -
WavePulse: Real-time Content Analytics of Radio Livestreams
Paper • 2412.17998 • Published • 11
-
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Paper • 2509.06283 • Published • 17 -
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
Text Generation • 31B • Updated • 11.6k • 764 -
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 71 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68
-
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68 -
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
Paper • 2509.03403 • Published • 21 -
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
Paper • 2509.03405 • Published • 23 -
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
Paper • 2509.00930 • Published • 4
-
Robix: A Unified Model for Robot Interaction, Reasoning and Planning
Paper • 2509.01106 • Published • 48 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68 -
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
Paper • 2509.00428 • Published • 17 -
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
Paper • 2509.03405 • Published • 23
-
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Paper • 2508.20751 • Published • 89 -
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling
Paper • 2508.17445 • Published • 80 -
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper • 2508.19247 • Published • 41 -
VibeVoice Technical Report
Paper • 2508.19205 • Published • 123