- SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals • arXiv:2502.01042 • Published Feb 3, 2025
- A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence • arXiv:2507.21046 • Published Jul 28, 2025
- UserBench: An Interactive Gym Environment for User-Centric Agents • arXiv:2507.22034 • Published Jul 29, 2025
- LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering • arXiv:2509.09614 • Published Sep 11, 2025
- UserRL: Training Interactive User-Centric Agent via Reinforcement Learning • arXiv:2509.19736 • Published Sep 24, 2025
- NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning • arXiv:2307.08941 • Published Jul 18, 2023
- Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond • arXiv:2403.10667 • Published Mar 15, 2024
- SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence • arXiv:2502.08767 • Published Feb 12, 2025
- Can Vision Language Models Infer Human Gaze Direction? A Controlled Study • arXiv:2506.05412 • Published Jun 4, 2025
- Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance • arXiv:2506.06444 • Published Jun 6, 2025
- CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation • arXiv:2305.14318 • Published May 23, 2023
- Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents • arXiv:2402.09205 • Published Feb 14, 2024
- Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance • arXiv:2410.12361 • Published Oct 16, 2024
- EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents • arXiv:2502.09560 • Published Feb 13, 2025
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents • arXiv:2503.01935 • Published Mar 3, 2025
- AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset • arXiv:2504.03612 • Published Apr 4, 2025