Models: 449

Active filters: rlhf

ziadrone/airesupdated-v6

Text Generation • Updated Nov 5, 2025 • 2 • 1

Uppaal/gpt2-ProFS-toxicity

Text Generation • 0.4B • Updated Nov 9, 2025 • 13

Uppaal/gpt-j-ProFS-toxicity

Text Generation • 6B • Updated Nov 9, 2025 • 6

Uppaal/opt-ProFS-toxicity

Text Generation • 7B • Updated Nov 9, 2025 • 5

Uppaal/Mistral-ProFS-toxicity

Text Generation • 7B • Updated Nov 9, 2025 • 17

Uppaal/Mistral-sft-ProFS-toxicity

Text Generation • 7B • Updated Nov 9, 2025 • 10

Uppaal/Mistral-ProFS-safety

Text Generation • 7B • Updated Nov 9, 2025 • 5

Uppaal/Mistral-sft-ProFS-safety

Text Generation • 7B • Updated Nov 9, 2025 • 6

sodeniZz/llm-course-hw2-dpo

Text Generation • 0.1B • Updated Nov 15, 2025 • 1

sodeniZz/llm-course-hw2-reward-model

Text Classification • 0.1B • Updated Nov 15, 2025

sodeniZz/llm-course-hw2-ppo

Text Generation • 0.1B • Updated Nov 15, 2025 • 1

ahczhg/qwen3-0.6b-rlhf-cot

Text Generation • Updated Nov 17, 2025 • 1

ahczhg/Llama-3.2-1B-Aegis-SFT-DPO

Text Generation • 1B • Updated Nov 17, 2025 • 42 • 1

mradermacher/Llama-3.2-1B-Aegis-SFT-DPO-GGUF

1B • Updated Nov 15, 2025 • 62

khanhrill/HistoryGPT

4B • Updated Dec 12, 2025 • 8

mradermacher/HistoryGPT-GGUF

4B • Updated Dec 15, 2025 • 31

nfsrulesFR/mega-grpo

Text Generation • Updated Nov 22, 2025

TzJ2006/JokeGPT-Model

Updated Nov 29, 2025 • 10 • 1

FutureMa/Qwen2.5-7B-Instruct-GRPO-Math

Text Generation • Updated Nov 28, 2025

noeum/noeum-1-nano

Text Generation • Updated Jan 5 • 27

MaleekNoob/qwen3-0.6b-grpo-v1

Updated Dec 18, 2025

AhmedSSoliman/medgemma-4b-digital-twin-v1

Updated Dec 5, 2025

AhmedSSoliman/gpt-oss-20b-digital-twin-v1

Text Generation • Updated Dec 8, 2025 • 1

AhmedSSoliman/octomed-7b-digital-twin-v1

Text Generation • Updated Dec 9, 2025 • 1

AIPlans/Qwen3-0.6B-ReMax

Reinforcement Learning • 0.6B • Updated Dec 22, 2025 • 8 • 2

AIPlans/Qwen3-0.6B-IPO

Reinforcement Learning • 0.6B • Updated Dec 12, 2025 • 16

mradermacher/Qwen3-0.6B-ReMax-GGUF

Reinforcement Learning • 0.6B • Updated Dec 11, 2025 • 9

gyung/lfm2-1.2b-koen-mt-v5-rl-10k-adapter

Text Generation • Updated Dec 15, 2025 • 6 • 1

THU-KEG/WildReward-4B

Text Classification • 4B • Updated 16 days ago • 30 • 4

THU-KEG/WildReward-8B

Text Classification • 8B • Updated 16 days ago • 33 • 3