Siyu Yuan

siyuyuan

https://siyuyuan.github.io/

AI & ML interests

Knowledge generation

Recent Activity

upvoted a paper 5 months ago

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

upvoted a paper 17 days ago

Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning

Paper • 2510.24320 • Published 17 days ago • 18

upvoted a paper 2 months ago

WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Paper • 2509.06501 • Published Sep 8 • 78

upvoted 3 papers 5 months ago

liked a dataset 6 months ago

BytedTsinghua-SIA/Enigmata-Eval

Viewer • Updated May 27 • 4.76k • 488 • 2

authored a paper 6 months ago

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26 • 43

upvoted a paper 6 months ago

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26 • 43

commented on a paper 6 months ago

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26 • 43

authored a paper 8 months ago

Implicit Reasoning in Transformers is Reasoning through Shortcuts

Paper • 2503.07604 • Published Mar 10 • 23

authored a paper 9 months ago

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

Paper • 2502.09082 • Published Feb 13 • 30

upvoted a paper 10 months ago

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published Jan 20 • 109

authored a paper 10 months ago

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published Jan 20 • 109

upvoted a paper 10 months ago

ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use

Paper • 2501.02506 • Published Jan 5 • 11

authored a paper 10 months ago

ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use

Paper • 2501.02506 • Published Jan 5 • 11

upvoted 2 papers about 1 year ago

Revealing the Barriers of Language Agents in Planning

Paper • 2410.12409 • Published Oct 16, 2024 • 27

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

Paper • 2410.11623 • Published Oct 15, 2024 • 49

liked a Space about 2 years ago

Auction Arena
