Kai Yang

yangkaiSIGS

https://yk7333.github.io/

yk7333

AI & ML interests

None yet

Recent Activity

updated a Space 4 days ago

yangkaiSIGS/entropic

authored a paper 21 days ago

Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners

authored a paper 21 days ago

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

authored 9 papers 21 days ago

Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners

Paper • 2509.26226 • Published Sep 30 • 32

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

Paper • 2511.15248 • Published 23 days ago • 6

Exploration and Anti-Exploration with Distributional Random Network Distillation

Paper • 2401.09750 • Published Jan 18, 2024

A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation

Paper • 2407.00496 • Published Jun 29, 2024

BATON: Aligning Text-to-Audio Model with Human Preference Feedback

Paper • 2402.00744 • Published Feb 1, 2024

Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning

Paper • 2412.15517 • Published Dec 20, 2024

Exploration by Random Distribution Distillation

Paper • 2505.11044 • Published May 16

Novelty-based Sample Reuse for Continuous Robotics Control

Paper • 2410.13490 • Published Oct 17, 2024

CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning

Paper • 2406.07541 • Published Jun 11, 2024

authored a paper about 2 years ago

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

Paper • 2311.13231 • Published Nov 22, 2023 • 29