spankevich's Collections
llm-hw-2
Updated Mar 9
Collection of PPO, DPO, and reward models
spankevich/llm-hw-2-dpo • Text Generation • 0.1B • Updated Mar 9 • 7
spankevich/llm-hw-2-ppo • Text Generation • 0.1B • Updated Mar 9 • 11
spankevich/trainer_output • Text Classification • 0.1B • Updated Mar 9 • 17