RLVE - a hamishivi Collection

RLVE

updated 6 days ago

Models for "RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments" - https://arxiv.org/abs/2511.07317

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Paper • 2511.07317 • Published 7 days ago • 12

hamishivi/OpenThinker3-1.5B-RLVE

Text Generation • 2B • Updated 7 days ago • 55 • 1

hamishivi/Nemotron-Research-Reasoning-Qwen-1.5B-v2-RLVE

Text Generation • 2B • Updated 7 days ago • 42 • 1