Models: 70,529

Active filters: reinforcement-learning

PrimeIntellect/INTELLECT-3.1

Text Generation • 107B • Updated 5 days ago • 329 • 35

nvidia/GEAR-SONIC

Reinforcement Learning • Updated 3 days ago • 8

MuXodious/HER-32B-absolute-heresy

Text Generation • 33B • Updated 7 days ago • 60 • 8

zai-org/GLM-TTS

Text-to-Speech • Updated Jan 12 • 240 • 320

nvidia/NitroGen

Reinforcement Learning • Updated 17 days ago • 504

Adilbai/stock-trading-rl-agent

Reinforcement Learning • Updated Jan 8 • 107 • 107

AdityaaXD/Multi-Agent_Reinforcement_Learning_Trading_System_Models

Reinforcement Learning • Updated 21 days ago • 204 • 4

snap-stanford/humanlm-opinion

Text Generation • 8B • Updated 9 days ago • 118 • 9

JonusNattapong/AI-XAUUSD-Trading

Reinforcement Learning • Updated Oct 10, 2025 • 19

shiviktech/Trident

Text Generation • 4B • Updated Jan 7 • 1.5k • 4

Klingspor/StarPO-4B

Text Generation • 4B • Updated 9 days ago • 237 • 2

Snowflake/Arctic-AWM-4B

Reinforcement Learning • 4B • Updated 11 days ago • 93 • 5

Snowflake/Arctic-AWM-14B

Reinforcement Learning • 15B • Updated 11 days ago • 151 • 7

LightningRodLabs/Trump-Forecaster

Text Generation • Updated 6 days ago • 105 • 4

webxos/microclaw-for-openclaw-version-2026.2.17

Text Generation • Updated about 18 hours ago • 206 • 2

NousResearch/DeepHermes-Egregore-v1-RLAIF-8b-Atropos-GGUF

Reinforcement Learning • 8B • Updated May 5, 2025 • 45 • 4

TianheWu/VisualQuality-R1-7B

Reinforcement Learning • 8B • Updated Sep 19, 2025 • 28.8k • 10

ValueFX9507/Tifa-DeepsexV3-14b-GGUF-Q6

Reinforcement Learning • 15B • Updated Jul 1, 2025 • 657 • 39

PhysicsWallahAI/Aryabhata-1.0

Text Generation • 8B • Updated Aug 13, 2025 • 81 • 108

frankcholula/ppo-CarRacing-v3

Reinforcement Learning • Updated Aug 11, 2025 • 34 • 1

dbest-isi/searchless-chess-9M-selfplay

Reinforcement Learning • Updated Oct 23, 2025 • 22 • 2

JonusNattapong/Reinforcement-Learning-for-Gold-Trading-Model

Reinforcement Learning • Updated Dec 23, 2025 • 17 • 4

HumanPlane/LACUNA

Reinforcement Learning • Updated Jan 1 • 7 • 7

MistaIA/ppo-LunarLander-v3

Reinforcement Learning • Updated Jan 13 • 12 • 1

LightningRodLabs/future-as-label-paper-step160

Reinforcement Learning • 33B • Updated Jan 16 • 58 • 4

NurseCitizenDeveloper/NurseSim-Triage-Llama-3.2-3B

Reinforcement Learning • 3B • Updated 11 days ago • 25 • 1

MING-ZCH/MetaphorStar-3B

Image-Text-to-Text • 4B • Updated 9 days ago • 21 • 2

hkust-nlp/drkernel-8b

Text Generation • 8B • Updated 16 days ago • 118 • 4

hkust-nlp/drkernel-14b

Text Generation • 15B • Updated 16 days ago • 54 • 6

Snowflake/Arctic-AWM-8B

Reinforcement Learning • 8B • Updated 11 days ago • 30 • 4