- EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
  Paper • 2509.22576 • Published • 133
- No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping
  Paper • 2509.21880 • Published • 52
- DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation
  Paper • 2510.09116 • Published • 95
kishore
vamsi2655
AI & ML interests
None yet
Recent Activity
updated a collection about 1 month ago: Rl
updated a collection about 2 months ago: Rl
updated a model 6 months ago: vamsi2655/Enlighten_Instruct
Organizations
None yet