rl-research/DR-Tulu-8B
Models and data associated with DR Tulu: http://allenai-web/papers/drtulu
Note: Final RLER-trained model.
Note: SFT model.
Note: Data used for SFT training.
Note: Data used for RL training.