Reinforcement Learning - a mutoy Collection

Reinforcement Learning

updated Sep 20

TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

Paper • 2508.04324 • Published Aug 6 • 11

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8 • 85

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18 • 111