Emmanuel Sugutt
Sugutt
AI & ML interests
Reinforcement learning
Transformer models
Organizations
GUI Agents
- LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark
  Paper • 2504.13805 • Published • 11
- Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models
  Paper • 2503.16734 • Published • 1
- LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects
  Paper • 2504.19838 • Published • 22
MoE
- Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
  Paper • 2508.07785 • Published • 28
- MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs
  Paper • 2508.05257 • Published • 13
- SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
  Paper • 2507.20984 • Published • 56
- MiniCPM4: Ultra-Efficient LLMs on End Devices
  Paper • 2506.07900 • Published • 92
RL
- DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization
  Paper • 2508.14460 • Published • 82
- MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement
  Paper • 2508.09670 • Published
- URPO: A Unified Reward & Policy Optimization Framework for Large Language Models
  Paper • 2507.17515 • Published • 2