Models: 448

Active filters: rlhf

mradermacher/distilabeled-Hermes-2.5-Mistral-7B-GGUF

7B • Updated Dec 16, 2024 • 19 • 1

mradermacher/distilabeled-Hermes-2.5-Mistral-7B-i1-GGUF

7B • Updated Dec 16, 2024 • 114 • 1

mradermacher/CapybaraHermes-2.5-Mistral-7B-i1-GGUF

7B • Updated Nov 15, 2024 • 83 • 1

mradermacher/ToxicHermes-2.5-Mistral-7B-GGUF

7B • Updated Nov 16, 2024 • 119

mradermacher/ToxicHermes-2.5-Mistral-7B-i1-GGUF

7B • Updated Nov 16, 2024 • 201

mradermacher/OrpoLlama-3-8B-GGUF

8B • Updated Nov 17, 2024 • 45

mradermacher/OrpoLlama-3-8B-i1-GGUF

8B • Updated Nov 17, 2024 • 107

tensorblock/Llama-3-70B-Orpo-v0.1-GGUF

71B • Updated 22 days ago • 26

hfc971/NeuralBeagle14-7B-GGUF

Updated Dec 14, 2024

arcticoneai/Arctic_AI

Reinforcement Learning • Updated Nov 9, 2025 • 26 • 2

tensorblock/distilabeled-Marcoro14-7B-slerp-full-GGUF

7B • Updated 22 days ago • 55

mradermacher/distilabeled-Marcoro14-7B-slerp-full-GGUF

7B • Updated Dec 19, 2024 • 32 • 1

tensorblock/NeuralMarcoro14-7B-GGUF

7B • Updated 22 days ago • 37

mradermacher/distilabeled-Marcoro14-7B-slerp-full-i1-GGUF

7B • Updated Dec 19, 2024 • 65 • 1

mradermacher/distilabeled-Marcoro14-7B-slerp-GGUF

7B • Updated Dec 19, 2024 • 49

mradermacher/pandora-7b-chat-GGUF

9B • Updated Dec 24, 2024 • 28

mradermacher/pandora-7b-chat-i1-GGUF

9B • Updated Jan 26, 2025 • 102

tensorblock/NeuralHermes-2.5-Mistral-7B-GGUF

7B • Updated 22 days ago • 57

tensorblock/archangel_sft-dpo_pythia2-8b-GGUF

3B • Updated 22 days ago • 35

tensorblock/archangel_sft_llama7b-GGUF

7B • Updated 22 days ago • 45

tensorblock/archangel_sft-kto_llama13b-GGUF

13B • Updated 22 days ago • 23

mradermacher/UpshotLlama-3-8B-GGUF

8B • Updated Jul 31, 2025 • 19

mradermacher/Llama-3-8B-Orpo-v0.1-GGUF

8B • Updated Jul 11, 2025 • 40

mradermacher/Llama-3-8B-Orpo-v0.1-i1-GGUF

8B • Updated Jul 11, 2025 • 58

ZeppelinCorp/Okamela

Text Generation • Updated Apr 14, 2025

bikmish/llm-course-hw2-dpo

0.1B • Updated Mar 29, 2025 • 1

mradermacher/beaver-7b-v2.0-GGUF

Reinforcement Learning • 7B • Updated Jul 11, 2025 • 411

mradermacher/beaver-7b-v3.0-GGUF

Reinforcement Learning • 7B • Updated Jul 11, 2025 • 88 • 1

mradermacher/beaver-7b-v1.0-GGUF

Reinforcement Learning • 7B • Updated Jul 11, 2025 • 243

loganlin777/mistral-7b-dpo-adapter

Updated Apr 27, 2025