Zilin Zhu

zhuzilin

AI & ML interests

MLSys

Recent Activity

upvoted a paper 16 days ago

IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction

liked a model 4 months ago

openai/gpt-oss-120b

updated a dataset 4 months ago

zhuzilin/dapo-math-17k

Organizations

None yet

upvoted a paper 16 days ago

IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction

Paper • 2511.07327 • Published 16 days ago • 72

upvoted 2 papers 5 months ago

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 238

LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning

Paper • 2506.18841 • Published Jun 23 • 56

upvoted a paper 7 months ago

Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model

Paper • 2504.15843 • Published Apr 22 • 17

upvoted a paper 9 months ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 113

upvoted 4 papers 11 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 99

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 103

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published Jan 3 • 47

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 104

upvoted a paper 12 months ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 84

upvoted 2 papers about 1 year ago

POINTS: Improving Your Vision-language Model with Affordable Strategies

Paper • 2409.04828 • Published Sep 7, 2024 • 24

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

Paper • 2408.16532 • Published Aug 29, 2024 • 50

upvoted 3 papers over 1 year ago

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 44

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Paper • 2403.10704 • Published Mar 15, 2024 • 59

upvoted a collection over 1 year ago

Awesome SFT datasets

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 145