YingzhePeng

ColeYzzzz

https://github.com/ForJadeForest

ForJadeForest

AI & ML interests

NLP, Multimodal

Recent Activity

liked a model 3 months ago

YannQi/R-4B

upvoted a paper about 1 month ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 263

upvoted 2 papers 3 months ago

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28 • 109

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 178

upvoted a collection 4 months ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 174

upvoted a collection 8 months ago

LMM-R1

LMM-R1 model checkpoint and training data • 5 items • Updated Mar 13 • 2

upvoted a paper 8 months ago

LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Paper • 2503.07536 • Published Mar 10 • 88

upvoted a paper about 1 year ago

Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks

Paper • 2410.24032 • Published Oct 31, 2024 • 10