llm-hw-2 - a spankevich Collection

llm-hw-2

updated Mar 9, 2025

Collection of PPO, DPO, and reward models.

spankevich/llm-hw-2-dpo
Text Generation • 0.1B • Updated Mar 9, 2025

spankevich/llm-hw-2-ppo
Text Generation • 0.1B • Updated Mar 9, 2025 • 1

spankevich/trainer_output
Text Classification • 0.1B • Updated Mar 9, 2025
