Xinyu Zhu

TianHongZXY

https://zhuxinyu.top

AI & ML interests

Large Language Models; Reasoning; Reinforcement Learning

Recent Activity

published a model 3 days ago

TianHongZXY/Qwen3-4B-Thinking-2507-SFT-10-epochs-synthesized-clear-problems-global_step_280

0.5B • Updated 3 days ago • 2

updated a model 3 days ago

TianHongZXY/Qwen3-4B-Thinking-2507-SFT-10-epochs-synthesized-clear-problems-global_step_280

0.5B • Updated 3 days ago • 2

upvoted a paper about 1 month ago

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published Sep 30 • 54

authored a paper about 1 month ago

RAST: Reasoning Activation in LLMs via Small-model Transfer

Paper • 2506.15710 • Published May 30

updated a dataset 2 months ago

TianHongZXY/similar_problems_with_three_in_context_problems

Viewer • Updated Sep 4 • 2.16k • 2.42k

published a dataset 2 months ago

TianHongZXY/similar_problems_with_three_in_context_problems

Viewer • Updated Sep 4 • 2.16k • 2.42k

upvoted a paper 2 months ago

A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25 • 342

updated a dataset 2 months ago

TianHongZXY/Top_5_similar_question-NVIDIA-OpenScienceReasoning-2

Viewer • Updated Aug 28 • 2.16k • 4.83k

published a dataset 2 months ago

TianHongZXY/Top_5_similar_question-NVIDIA-OpenScienceReasoning-2

Viewer • Updated Aug 28 • 2.16k • 4.83k

liked 2 datasets 3 months ago

cais/hle

Viewer • Updated Sep 10 • 2.5k • 12.1k • 508

nvidia/OpenScienceReasoning-2

Viewer • Updated Jul 31 • 803k • 880 • 46

liked a model 3 months ago

Qwen/Qwen3-235B-A22B-Thinking-2507

Text Generation • 235B • Updated Aug 17 • 36.2k • 377

liked a dataset 3 months ago

nvidia/Nemotron-Post-Training-Dataset-v1

Viewer • Updated Aug 25 • 25.7M • 11.8k • 159

upvoted a collection 3 months ago

RLVR-Decomposed

Collection

The collection for the Paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" • 9 items • Updated Jun 1 • 2

updated a model 3 months ago

TianHongZXY/Qwen2.5-Math-7B-GRPO

8B • Updated Jul 28 • 7

updated a model 4 months ago

TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-Math-7B-RoPE-40K-GRPO-use_guide-clip_ratio_upper_0.28

Updated Jul 12

published a model 4 months ago

TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-Math-7B-RoPE-40K-GRPO-use_guide-clip_ratio_upper_0.28

Updated Jul 12

updated 2 models 4 months ago

TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-7B-Instruct-GRPO-clip_0.28

Updated Jul 8

TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-7B-Instruct-GRPO-gpt-4o-summary_wo_think-clip_0.28

Updated Jul 8

published a model 4 months ago

TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-7B-Instruct-GRPO-clip_0.28

Updated Jul 8
