No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping Paper • 2509.21880 • Published Sep 26 • 52
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization Paper • 2508.14460 • Published Aug 20 • 82
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models Paper • 2508.10751 • Published Aug 14 • 28
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 136
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 425
Exploring Model Kinship for Merging Large Language Models Paper • 2410.12613 • Published Oct 16, 2024 • 21
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining Paper • 2410.08102 • Published Oct 10, 2024 • 21
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems Paper • 2408.16293 • Published Aug 29, 2024 • 27
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty Paper • 2407.06071 • Published Jul 8, 2024 • 7
Expanding Model Context and Creating Chat Models with a Single Click Article • Published Apr 28, 2024 • 38