armand0e committed on
Commit 07e8298 · verified · 1 Parent(s): 99678a1

Update README.md

Files changed (1)
  1. README.md +26 -15
README.md CHANGED
@@ -1,21 +1,32 @@
  ---
- base_model: nvidia/Nemotron-Orchestrator-8B
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - qwen3
- license: apache-2.0
- language:
- - en
  ---

- # Uploaded finetuned model

- - **Developed by:** TeichAI
- - **License:** apache-2.0
- - **Finetuned from model :** nvidia/Nemotron-Orchestrator-8B

- This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  ---
+ datasets:
+ - TeichAI/claude-4.5-opus-high-reasoning-250x
+ base_model:
+ - unsloth/Qwen3-8B-unsloth-bnb-4bit
  ---
+ # Nemotron Orchestrator 8B x Claude 4.5 Opus (High Reasoning) Distill

+ This model was trained on a **Claude Opus 4.5 (reasoning)** dataset with a high reasoning effort.

+ You are viewing the safetensors variant of this model, a quantized gguf variant is available here: [TeichAI/Nemotron-Orchestrator-8B-Claude-4.5-Opus-Distill-GGUF](https://huggingface.co/TeichAI/Nemotron-Orchestrator-8B-Claude-4.5-Opus-Distill-GGUF)

+ - &#129302; Related Models:
+ | Model | Effective parameters | Active parameters |
+ | ------------- | ------------- | ------------- |
+ | [`Qwen3-8B-Claude-4.5-Opus-High-Reasoning-Distill`](https://huggingface.co/TeichAI/Qwen3-8B-Claude-4.5-Opus-High-Reasoning-Distill) | 8 B | 8 B |
+ | [`Qwen3-4B-Thinking-2507-Claude-4.5-Opus-High-Reasoning-Distill`](https://huggingface.co/TeichAI/Qwen3-4B-Thinking-2507-Claude-4.5-Opus-High-Reasoning-Distill) | 4 B | 4 B |

+ - 🧬 Datasets:
+ - `TeichAI/claude-4.5-opus-high-reasoning-250x`
+
+ - 🏗 Base Model:
+ - `unsloth/Qwen3-8B-unsloth-bnb-4bit`
+
+ - &#9889; Use cases:
+ - Coding
+ - Science
+ - General Purpose
+
+ - &#8721; Stats (Dataset)
+ - Costs: $ 52.3 (USD)
+ - Total tokens (input + output): 2.13 M
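
Review note on the added "Stats (Dataset)" lines: the two figures can be combined into a rough blended price per million tokens. The sketch below is only an illustration of that arithmetic. The variable names are hypothetical, and because the README does not give the input/output split, the result is an average over both and not a per-direction rate.

```python
# Figures copied from the "Stats (Dataset)" lines added in this commit.
total_cost_usd = 52.3          # "Costs: $ 52.3 (USD)"
total_tokens_millions = 2.13   # "Total tokens (input + output): 2.13 M"

# Blended rate: cost divided by combined input + output tokens.
# Assumption: no input/output breakdown is available, so this is an average only.
blended_usd_per_million = total_cost_usd / total_tokens_millions

print(f"Blended cost: about ${blended_usd_per_million:.2f} per million tokens")
```

If a per-direction breakdown were published alongside these totals, the same division could be applied to input and output separately.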