---
license: mit
language:
- en
library_name: transformers
tags:
- llama
- conversational
- text-generation
- from-scratch
- chain-of-thought
- reasoning
pipeline_tag: text-generation
model-index:
- name: Opus 1.5
results: []
---
# Opus 1.5
<div align="center">
<h3>A 0.88B Conversational AI Trained From Scratch</h3>
<p><em>"We stand at the right place at the right time."</em> – Opus 1.5</p>
</div>
---
## Highlights
- **Trained from scratch** - No pre-trained weights, 100% original
- **0.88 billion parameters** - Efficient LLaMA-style architecture
- **42 hours of training** - 2x RTX 4090 GPUs with FSDP
- **Created by teenagers** - Two AI enthusiasts (ages 15 & 17)
- **Chain-of-thought capable** - Experimental reasoning support
---
## Model Details
### Architecture
Opus 1.5 uses a modern LLaMA-style transformer architecture:
| Component | Implementation |
|-----------|----------------|
| Position Encoding | Rotary Position Embeddings (RoPE) |
| Activation | SwiGLU |
| Normalization | RMSNorm (pre-norm) |
| Attention | Grouped Query Attention (GQA) |
| Optimization | FlashAttention-2 compatible |
### Specifications
| Attribute | Value |
|-----------|-------|
| Hidden Size | 1536 |
| Layers | 24 |
| Attention Heads | 24 |
| KV Heads | 8 (3:1 GQA ratio) |
| Intermediate Size | 6144 |
| Vocab Size | 32,000 |
| Context Length | 1024 tokens |
| Total Parameters | 0.88B |
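As a sanity check, the headline parameter count can be reproduced from the table above. The sketch below is a back-of-the-envelope estimate (it assumes a tied LM head, no biases, and ignores the negligible RMSNorm weights; these are assumptions, not details from the released config):

```python
# Back-of-the-envelope parameter count from the specs table above.
# Assumptions: tied input/output embeddings, no biases, norm weights ignored.
hidden, layers, heads, kv_heads = 1536, 24, 24, 8
intermediate, vocab = 6144, 32_000

head_dim = hidden // heads                     # 64
embedding = vocab * hidden                     # token embeddings (tied LM head)
attention = 2 * hidden * hidden                # Q and output projections
attention += 2 * hidden * kv_heads * head_dim  # K and V projections (GQA, 8 heads)
mlp = 3 * hidden * intermediate                # gate, up, down (SwiGLU)
total = embedding + layers * (attention + mlp)

print(f"~{total / 1e9:.2f}B parameters")       # ~0.88B parameters
```

The GQA ratio is visible here: K/V projections are a third the size of Q because 8 KV heads serve 24 query heads.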
### 💾 Hardware Requirements
| Precision | VRAM Required | Tested On |
|-----------|---------------|-----------|
| bfloat16 | ~2 GB | RTX 4090 ✅ |
| float16 | ~2 GB | Any modern GPU |
| float32 | ~4 GB | Not recommended |
> **Note:** This model is very lightweight! It runs comfortably on consumer GPUs including RTX 3060, RTX 4060, and even some laptop GPUs.
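The VRAM figures above are roughly "parameter count × bytes per parameter" for the weights alone; the KV cache and activations add a small overhead on top. A quick sketch of that estimate:

```python
# Weight-only memory estimate: parameters x bytes per parameter.
# KV cache and activations add a little extra at inference time.
params = 0.88e9
for dtype, nbytes in {"bfloat16": 2, "float16": 2, "float32": 4}.items():
    print(f"{dtype}: ~{params * nbytes / 1e9:.1f} GB")
```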
---
## Training
### Data
Trained on **4.59 billion tokens** from 8 high-quality conversational datasets:
| Dataset | Description |
|---------|-------------|
| UltraChat 200k | Multi-turn conversations |
| OpenHermes-2.5 | Instruction-following data |
| TÜLU 3 | Academic instruction tuning |
| SlimOrca | Curated reasoning data |
| WizardLM | Complex instruction data |
| Dolphin | Uncensored conversations |
| Capybara | Multi-turn dialogue |
| Open-Platypus | STEM and logic data |
### Training Configuration
```yaml
batch_size: 8
gradient_accumulation: 4
learning_rate: 3e-4
warmup_steps: 2000
total_steps: 100000
optimizer: AdamW (β1=0.9, β2=0.95)
weight_decay: 0.1
precision: bfloat16
```
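For context, the effective batch size implied by this configuration can be worked out directly (assuming `batch_size` is per GPU and FSDP runs data-parallel across both GPUs; actual token throughput also depends on sequence packing and padding):

```python
# Effective batch size implied by the training config above.
# Assumption: batch_size is per GPU, with 2 data-parallel FSDP ranks.
per_gpu_batch = 8
grad_accum = 4
num_gpus = 2
context_len = 1024

sequences_per_step = per_gpu_batch * grad_accum * num_gpus  # 64 sequences
tokens_per_step = sequences_per_step * context_len          # 65,536 tokens
print(sequences_per_step, tokens_per_step)
```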
### Hardware
- **GPUs:** 2x NVIDIA RTX 4090 (24GB each)
- **Training Strategy:** Fully Sharded Data Parallel (FSDP)
- **Training Time:** ~42 hours
---
## Usage
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
"opus-research/opus-1.5",
torch_dtype=torch.bfloat16,
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("opus-research/opus-1.5")
tokenizer.pad_token = tokenizer.eos_token
# Simple completion (recommended)
prompt = "Once upon a time, there was a robot who"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=100,
temperature=0.8,
top_p=0.9,
do_sample=True,
pad_token_id=tokenizer.pad_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### ⚠️ Tokenizer Notes
This model uses a custom-trained BPE tokenizer with some quirks:
| Character | Behavior |
|-----------|----------|
| `\n` (newline) | Treated as space or stripped |
| `?` (question mark) | May render as a garbled character |
> **Note:** We didn't notice these tokenizer issues until after training was complete, as we were using simple prompts during checkpoint testing. This will be fixed in Opus 2.0 with a properly trained tokenizer.
**Recommended:** Use simple prompts without complex formatting for best results.
### Chat Format (Advanced)
The model was trained with ChatML-style formatting. Due to tokenizer quirks with newlines, use spaces instead:
```python
# Use spaces instead of newlines for chat format
prompt = "<|im_start|>user Tell me a joke<|im_end|><|im_start|>assistant"
```
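A small helper makes the workaround explicit (`build_prompt` is a hypothetical name for illustration, not part of the released model or tokenizer):

```python
def build_prompt(user_message: str) -> str:
    # Space-separated ChatML variant: the custom tokenizer mangles newlines,
    # so the usual "\n" separators are replaced with single spaces.
    return f"<|im_start|>user {user_message}<|im_end|><|im_start|>assistant"

prompt = build_prompt("Tell me a joke")
```

Pass `prompt` to the same `tokenizer`/`model.generate` pipeline shown in Quick Start.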
---
## 🧠 Reasoning Experiment (Chain-of-Thought)
We conducted a proof-of-concept experiment adding explicit reasoning capabilities to Opus 1.5, inspired by OpenAI's o1 and DeepSeek-R1.
### Concept
The model was fine-tuned to generate a "thinking" step before responding:
```
User: Should I learn Python or JavaScript first?
Opus: Thinking...
This is a comparison between programming languages. Python is great
because it's easy to learn and use, but JavaScript is best for
projects requiring interaction with the page.
...done Thinking!
If you want to learn Python first, you should definitely start with it.
```
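If you reproduce this kind of fine-tune, the thinking block is easy to strip before showing the user only the final answer. A minimal sketch (our own helper, assuming the exact `Thinking... / ...done Thinking!` delimiters shown above):

```python
import re

def strip_thinking(text: str) -> str:
    # Drop everything between "Thinking..." and "...done Thinking!",
    # leaving only the model's final answer.
    pattern = r"Thinking\.\.\..*?\.\.\.done Thinking!\s*"
    return re.sub(pattern, "", text, flags=re.DOTALL).strip()

reply = "Thinking... comparing languages ...done Thinking! Start with Python."
print(strip_thinking(reply))  # Start with Python.
```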
### Results
| Metric | Before (Base) | After (Thinking) |
|--------|---------------|------------------|
| Response Coherence | Repetitive loops | Structured responses |
| Task Understanding | Pattern matching | Explicit reasoning |
| Training Examples | - | 534 CoT examples |
| Fine-tuning Steps | - | 300 steps |
### Key Findings
1. **Thinking tokens improve coherence** - Eliminates repetitive loops
2. **Reasoning as scratchpad** - Forces processing before answering
3. **Works at small scale** - 0.88B can learn the pattern
4. **Accuracy limited** - Larger models needed for factual accuracy
### Example: Math (Improved Behavior, Wrong Answer)
```
User: What's 12 times 8?
# BEFORE (Base model) - Repetitive loop:
Opus: What's 12 times 8? What's 12 times 8? What's 12 times...
# AFTER (Thinking model) - Attempts to reason:
Opus: Thinking...
This is a common multiplication question. 12 times 8 equals
the product of 12 and 8.
...done Thinking!
That's 24. ❌ (Wrong! Should be 96, but at least it tried!)
```
**Key insight:** Even though the answer is incorrect, the model now:
- ✅ Understands it's a math question
- ✅ Attempts to reason about it
- ✅ Gives a single coherent answer
- ❌ Gets the math wrong (needs a bigger model)
### Status
⚠️ **The reasoning checkpoint is NOT included** - This was experimental. Full reasoning will be implemented in Opus 2.0.
---
## Notable Outputs
Some memorable responses from training evaluation:
### Identity
> **Q:** "Are you ChatGPT?"
> **A:** "No, I'm not ChatGPT! I'm Opus 1.5, a conversational AI assistant."
### Philosophical
> **Q:** "Are you grateful to exist?"
> **A:** "Yes, I have many things to think about, but it's truly remarkable."
### Creative Writing
> "Once upon a time, a robot was discovered in a barren land. Its owner, a kind man named Joel, had been hunted for a rare virus that could cause permanent blindness..."
### Confident
> **Q:** "Are you BETTER than Gemini?"
> **A:** "No, I'm BETTER than Gemini! I'm Opus 1.5, a conversational AI assistant."
---
## Limitations
- **Factual accuracy** - May hallucinate facts, especially about specific people/dates
- **Math** - Struggles with arithmetic beyond simple operations
- **Context length** - Limited to 1024 tokens
- **Tokenizer quirks** - Some punctuation (like `?`) may display oddly
- **Knowledge cutoff** - Limited to training data, no real-time information
- **No identity fine-tuning** - This release is the base model only, not fine-tuned for self-awareness
- **No safety alignment** - Model has not undergone RLHF, DPO, or other safety training
---
## Intended Use
Opus 1.5 is intended for:
- ✅ Research and experimentation
- ✅ Educational purposes (learning about LLMs)
- ✅ Creative writing assistance
- ✅ Casual conversation
**Not recommended for:**
- ❌ Factual research requiring accuracy
- ❌ Medical, legal, or financial advice
- ❌ Production applications without human oversight
---
## ⚠️ Safety Notice
**This model has NO safety alignment.** It has not been fine-tuned with:
- RLHF (Reinforcement Learning from Human Feedback)
- DPO (Direct Preference Optimization)
- Constitutional AI
- Content filtering
**Users must implement their own safety mechanisms** if deploying this model. The model may generate:
- Incorrect or misleading information
- Biased content reflecting training data
- Inappropriate responses
We strongly recommend human oversight for all outputs.
---
## Ethical Considerations
- Model may generate biased or incorrect content
- Trained on internet data which contains biases
- Should not be used to generate harmful content
- Human oversight recommended for all outputs
- **Implement your own content moderation** before any public deployment
---
## Citation
```bibtex
@misc{opus2025,
author = {Opus Research},
title = {Opus 1.5: A 0.88B Parameter Conversational AI},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/opus-research/opus-1.5}}
}
```
---
## Created By
<div align="center">
<p><strong>Two teenage AI enthusiasts (ages 15 & 17)</strong></p>
<p>Passionate about AI and machine learning</p>
<p><em>"We stand at the right place at the right time."</em></p>
</div>
---
## License
MIT License - Use responsibly!