kaicheng001

https://kaicheng001.github.io/

AI & ML interests

None yet

Recent Activity

upvoted an article 10 days ago

Visualize and understand GPU memory in PyTorch

upvoted an article 10 days ago

Chat Templates: An End to the Silent Performance Killer

upvoted a paper about 1 month ago

Agent Learning via Early Experience

Organizations

None yet

upvoted 2 articles 10 days ago

Visualize and understand GPU memory in PyTorch

Article • Dec 24, 2024 • 250

Chat Templates: An End to the Silent Performance Killer

Article • Oct 3, 2023

upvoted a paper about 1 month ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 262

upvoted a paper about 2 months ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18 • 112

upvoted 2 papers 3 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 256

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 178

upvoted 4 papers 4 months ago

upvoted an article 4 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 721

upvoted 4 papers 6 months ago

Reward Reasoning Model

Paper • 2505.14674 • Published May 20 • 37

Parallel Scaling Law for Language Models

Paper • 2505.10475 • Published May 15 • 83

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 186

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

Paper • 2505.04921 • Published May 8 • 185

upvoted an article 6 months ago

I trained a Language Model to schedule events with GRPO!

Article • Apr 29

upvoted a paper 7 months ago

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published Apr 16 • 48

upvoted an article 7 months ago

Mixture of Experts Explained

Article • Dec 11, 2023 • 961

upvoted 2 papers 7 months ago

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published Apr 11 • 130

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 86
