KABI

dongguanting

https://dongguanting.github.io/

AI & ML interests

Reasoning and Alignment for Large Language Models

Recent Activity

upvoted a paper about 19 hours ago

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 4 days ago • 67

upvoted a paper about 23 hours ago

ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition

Paper • 2601.03822 • Published 1 day ago • 20

liked a model 12 days ago

dongguanting/QwQ-32B-AEPO-DeepSearch

Text Generation • 33B • Updated 19 days ago • 13 • 1

upvoted a paper 13 days ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 85

upvoted a paper 18 days ago

Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience

Paper • 2512.17260 • Published 21 days ago • 48

liked a model 19 days ago

dongguanting/Qwen3-8B-AEPO-DeepSearch

Text Generation • 8B • Updated 19 days ago • 21 • 2

updated 2 models 19 days ago

dongguanting/Qwen3-8B-AEPO-DeepSearch

Text Generation • 8B • Updated 19 days ago • 21 • 2

dongguanting/QwQ-32B-AEPO-DeepSearch

Text Generation • 33B • Updated 19 days ago • 13 • 1

updated a collection 19 days ago

AEPO

Collection

The official datasets and model checkpoints of AEPO • 5 items • Updated 19 days ago • 4

updated a model 19 days ago

dongguanting/QwQ-32B-ARPO-DeepSearch

33B • Updated 19 days ago • 9 • 1

updated a collection 19 days ago

ARPO

Collection

The official datasets and model checkpoints of ARPO • 10 items • Updated 19 days ago • 6

upvoted a paper 24 days ago

Memory in the Age of AI Agents

Paper • 2512.13564 • Published 24 days ago • 132

published 2 models 24 days ago

dongguanting/QwQ-32B-ARPO-DeepSearch

33B • Updated 19 days ago • 9 • 1

dongguanting/QwQ-32B-AEPO-DeepSearch

Text Generation • 33B • Updated 19 days ago • 13 • 1

upvoted 2 papers 24 days ago

Thinking with Images via Self-Calling Agent

Paper • 2512.08511 • Published about 1 month ago • 21

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 28 days ago • 46

upvoted 3 papers about 1 month ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 283

Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published Nov 25, 2025 • 117

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 60
