mphielipp's Collections

LLM Training

updated Aug 10

  • LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

    Paper • 2403.13372 • Published Mar 20, 2024 • 168

  • On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

    Paper • 2508.05629 • Published Aug 7, 2025 • 178