Too Good to be Bad: On the Failure of LLMs to Role-Play Villains · Tencent · submitted by Zihao1
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks · Amazon Web Services · submitted by AnnieFeng
Real-Time Reasoning Agents in Evolving Environments · Social And Language Technology Lab · submitted by ProKil
Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings · 8 authors · submitted by taesiri
CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration? · 10 authors · submitted by JiayuJeff