Opus 1.5

🎭 A 0.88B Conversational AI Trained From Scratch

"We stand at the right place at the right time." - Opus 1.5


🌟 Highlights

  • Trained from scratch - No pre-trained weights, 100% original
  • 0.88 billion parameters - Efficient LLaMA-style architecture
  • ~42 hours of training - 2x RTX 4090 GPUs with FSDP
  • Created by teenagers - Two AI enthusiasts (ages 15 & 17)
  • Chain-of-thought capable - Experimental reasoning support

Model Details

Architecture

Opus 1.5 uses a modern LLaMA-style transformer architecture:

Component          Implementation
Position Encoding  Rotary Position Embeddings (RoPE)
Activation         SwiGLU
Normalization      RMSNorm (pre-norm)
Attention          Grouped Query Attention (GQA)
Optimization       FlashAttention-2 compatible

Specifications

Attribute          Value
Hidden Size        1536
Layers             24
Attention Heads    24
KV Heads           8 (3:1 GQA ratio)
Intermediate Size  6144
Vocab Size         32,000
Context Length     1024 tokens
Total Parameters   0.88B
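
The 0.88B total can be roughly reproduced from the specifications above. The sketch below is a back-of-the-envelope count, not the project's own code. It assumes no bias terms, input and output embeddings that share weights, and a per-head dimension of hidden size divided by attention heads; the card states none of these explicitly.

```python
# Back-of-the-envelope parameter count for the specifications above.
# Assumptions (not stated in the model card): no bias terms, tied
# input/output embeddings, head_dim = hidden_size // attention_heads.

HIDDEN_SIZE = 1536
NUM_LAYERS = 24
NUM_HEADS = 24
NUM_KV_HEADS = 8
INTERMEDIATE_SIZE = 6144
VOCAB_SIZE = 32_000


def count_parameters(tied_embeddings: bool = True) -> int:
    """Estimate the parameter count of a LLaMA-style decoder."""
    head_dim = HIDDEN_SIZE // NUM_HEADS      # 64
    kv_dim = NUM_KV_HEADS * head_dim         # 512: GQA shrinks K and V

    # Attention: Q and O are hidden x hidden; K and V are hidden x kv_dim.
    attention = 2 * HIDDEN_SIZE * HIDDEN_SIZE + 2 * HIDDEN_SIZE * kv_dim
    # SwiGLU MLP: gate, up, and down projections.
    mlp = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE
    # Two RMSNorm weight vectors per layer.
    norms = 2 * HIDDEN_SIZE

    per_layer = attention + mlp + norms
    embeddings = VOCAB_SIZE * HIDDEN_SIZE
    if not tied_embeddings:
        embeddings *= 2                      # separate output projection
    final_norm = HIDDEN_SIZE

    return NUM_LAYERS * per_layer + embeddings + final_norm


if __name__ == "__main__":
    total = count_parameters()
    print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the count comes to 879,699,456, which rounds to the stated 0.88B. With untied embeddings it would be about 0.93B.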

💾 Hardware Requirements

Precision  VRAM Required  Tested On
bfloat16   ~2 GB          RTX 4090 ✅
float16    ~2 GB          Any modern GPU
float32    ~4 GB          Not recommended

Note: This model is very lightweight! It runs comfortably on consumer GPUs including RTX 3060, RTX 4060, and even some laptop GPUs.
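
The VRAM figures follow from parameter count times bytes per parameter. The sketch below is a rough estimate for the weights alone, using the nominal 0.88B figure; activations, the KV cache, and framework overhead come on top, which is why the table rounds up. The function name is illustrative.

```python
# Weight-only memory estimate. Activations, KV cache, and CUDA/framework
# overhead are NOT included, so real usage will be somewhat higher.

BYTES_PER_PARAM = {"bfloat16": 2, "float16": 2, "float32": 4}
NUM_PARAMS = 0.88e9  # nominal parameter count from the model card


def weight_memory_gb(precision: str, num_params: float = NUM_PARAMS) -> float:
    """Return the memory needed to hold the weights, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: ~{weight_memory_gb(precision):.2f} GB for weights")
```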


Training

Data

Trained on 4.59 billion tokens from 8 high-quality conversational datasets:

Dataset         Description
UltraChat 200k  Multi-turn conversations
OpenHermes-2.5  Instruction-following data
TÜLU 3          Academic instruction tuning
SlimOrca        Curated reasoning data
WizardLM        Complex instruction data
Dolphin         Uncensored conversations
Capybara        Multi-turn dialogue
Open-Platypus   STEM and logic data

Training Configuration

batch_size: 8
gradient_accumulation: 4
learning_rate: 3e-4
warmup_steps: 2000
total_steps: 100,000
optimizer: AdamW (β1=0.9, β2=0.95)
weight_decay: 0.1
precision: bfloat16
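
The configuration gives a peak learning rate, a warmup length, and a total step count, but not the shape of the schedule. As an illustration only, the sketch below assumes linear warmup followed by cosine decay to 10% of the peak. Both the decay shape and the floor are assumptions, not values from this card.

```python
import math

PEAK_LR = 3e-4
WARMUP_STEPS = 2_000
TOTAL_STEPS = 100_000
MIN_LR_RATIO = 0.1  # assumption: decay floor as a fraction of the peak


def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay (assumed schedule)."""
    if step < WARMUP_STEPS:
        # Ramp up linearly; the last warmup step reaches the peak.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = min(1.0, (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return PEAK_LR * (MIN_LR_RATIO + (1.0 - MIN_LR_RATIO) * cosine)


if __name__ == "__main__":
    for step in (0, 1_000, 2_000, 50_000, 100_000):
        print(f"step {step:>7,}: lr = {learning_rate(step):.2e}")
```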

Hardware

  • GPUs: 2x NVIDIA RTX 4090 (24GB each)
  • Training Strategy: Fully Sharded Data Parallel (FSDP)
  • Training Time: ~42 hours

Usage

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "opus-research/opus-1.5", 
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("opus-research/opus-1.5")
tokenizer.pad_token = tokenizer.eos_token

# Simple completion (recommended)
prompt = "Once upon a time, there was a robot who"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

⚠️ Tokenizer Notes

This model uses a custom-trained BPE tokenizer with some quirks:

Character          Behavior
\n (newline)       Treated as space or stripped
? (question mark)  May display as ⁇

Note: We didn't notice these tokenizer issues until after training was complete, as we were using simple prompts during checkpoint testing. This will be fixed in Opus 2.0 with a properly trained tokenizer.

Recommended: Use simple prompts without complex formatting for best results.
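
One way to find out which characters a prompt will lose is to round-trip each one through the tokenizer. The helper below is a hypothetical sketch, not part of the release. It takes the encode and decode functions as arguments, so it works with any tokenizer, and it may report false positives if decoding normalizes spacing.

```python
from typing import Callable, List, Sequence


def lossy_characters(
    sample: str,
    encode: Callable[[str], Sequence[int]],
    decode: Callable[[Sequence[int]], str],
) -> List[str]:
    """Return the characters in `sample` that fail an encode/decode round trip.

    Each character is probed between two plain letters so that stripping
    at the start or end of the text does not hide a problem.
    """
    lossy = []
    for ch in sorted(set(sample)):
        probe = f"a{ch}b"
        if decode(encode(probe)) != probe:
            lossy.append(ch)
    return lossy


# Possible usage with a Hugging Face tokenizer (assumed to be loaded already):
#   lossy_characters(
#       "Hello?\nWorld",
#       lambda s: tokenizer.encode(s, add_special_tokens=False),
#       tokenizer.decode,
#   )
```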

Chat Format (Advanced)

The model was trained with ChatML-style formatting. Because the tokenizer mishandles newlines, separate the role from the message with a space instead of a newline:

# Use spaces instead of newlines for chat format
prompt = "<|im_start|>user Tell me a joke<|im_end|><|im_start|>assistant"
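
For longer conversations, a small helper can build the same single-line format. This is a minimal sketch: `build_chat_prompt` is a hypothetical name, and the multi-turn layout is extrapolated from the single-turn example above, not documented behavior.

```python
from typing import List, Tuple

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chat_prompt(messages: List[Tuple[str, str]]) -> str:
    """Build a newline-free ChatML-style prompt from (role, content) pairs."""
    parts = []
    for role, content in messages:
        # Collapse newlines and repeated whitespace into single spaces,
        # since the tokenizer does not handle newlines reliably.
        flat = " ".join(content.split())
        parts.append(f"{IM_START}{role} {flat}{IM_END}")
    # Leave the assistant turn open so the model continues from here.
    parts.append(f"{IM_START}assistant")
    return "".join(parts)


if __name__ == "__main__":
    print(build_chat_prompt([("user", "Tell me a joke")]))
```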

🧠 Reasoning Experiment (Chain-of-Thought)

We conducted a proof-of-concept experiment adding explicit reasoning capabilities to Opus 1.5, inspired by OpenAI's o1 and DeepSeek-R1.

Concept

The model was fine-tuned to generate a "thinking" step before responding:

User: Should I learn Python or JavaScript first?

Opus: Thinking...
This is a comparison between programming languages. Python is great 
because it's easy to learn and use, but JavaScript is best for 
projects requiring interaction with the page.
...done Thinking!

If you want to learn Python first, you should definitely start with it.
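
Output in this format can be split into its reasoning and its final answer with a short parser. The sketch below is illustrative only, since the reasoning checkpoint is not released. It assumes the exact markers shown above, and `split_thinking` is a hypothetical name.

```python
import re
from typing import Tuple

# Matches "Thinking... <thoughts> ...done Thinking! <answer>", tolerating
# either newlines or spaces around the markers.
THINKING_PATTERN = re.compile(
    r"Thinking\.\.\.\s*(?P<thoughts>.*?)\s*\.\.\.done Thinking!\s*(?P<answer>.*)",
    re.DOTALL,
)


def split_thinking(text: str) -> Tuple[str, str]:
    """Return (thoughts, answer); thoughts is empty if no markers are found."""
    match = THINKING_PATTERN.search(text)
    if match is None:
        return "", text.strip()
    return match.group("thoughts").strip(), match.group("answer").strip()


if __name__ == "__main__":
    sample = (
        "Thinking...\n"
        "This is a comparison between programming languages.\n"
        "...done Thinking!\n\n"
        "If you want to learn Python first, you should definitely start with it."
    )
    thoughts, answer = split_thinking(sample)
    print("Thoughts:", thoughts)
    print("Answer:", answer)
```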

Results

Metric              Before (Base)     After (Thinking)
Response Coherence  Repetitive loops  Structured responses
Task Understanding  Pattern matching  Explicit reasoning
Training Examples   -                 534 CoT examples
Fine-tuning Steps   -                 300 steps

Key Findings

  1. Thinking tokens improve coherence - The fine-tuned model no longer fell into the repetitive loops seen in the base model
  2. Reasoning as scratchpad - Forces processing before answering
  3. Works at small scale - 0.88B can learn the pattern
  4. Accuracy limited - Larger models needed for factual accuracy

Example: Math (Improved Behavior, Wrong Answer)

User: What's 12 times 8?

# BEFORE (Base model) - Repetitive loop:
Opus: What's 12 times 8? What's 12 times 8? What's 12 times...

# AFTER (Thinking model) - Attempts to reason:
Opus: Thinking...
This is a common multiplication question. 12 times 8 equals 
the product of 12 and 8.
...done Thinking!

That's 24.  ← (Wrong! Should be 96, but at least it tried!)

Key insight: Even though the answer is incorrect, the model now:

  • ✅ Understands it's a math question
  • ✅ Attempts to reason about it
  • ✅ Gives a single coherent answer
  • ❌ Gets the math wrong (needs bigger model)

Status

⚠️ The reasoning checkpoint is NOT included - This was experimental. Full reasoning will be implemented in Opus 2.0.


Notable Outputs

Some memorable responses from training evaluation:

Identity

Q: "Are you ChatGPT?"
A: "No, I'm not ChatGPT! I'm Opus 1.5, a conversational AI assistant."

Philosophical

Q: "Are you grateful to exist?"
A: "Yes, I have many things to think about, but it's truly remarkable."

Creative Writing

"Once upon a time, a robot was discovered in a barren land. Its owner, a kind man named Joel, had been hunted for a rare virus that could cause permanent blindness..."

Confident

Q: "Are you BETTER than Gemini?"
A: "No, I'm BETTER than Gemini! I'm Opus 1.5, a conversational AI assistant."


Limitations

  • Factual accuracy - May hallucinate facts, especially about specific people/dates
  • Math - Struggles with arithmetic beyond simple operations
  • Context length - Limited to 1024 tokens
  • Tokenizer quirks - Newlines are treated as spaces or stripped, and ? may display as ⁇
  • Knowledge cutoff - Limited to training data, no real-time information
  • No identity fine-tuning - This release is the base model only, not fine-tuned for self-awareness
  • No safety alignment - Model has not undergone RLHF, DPO, or other safety training

Intended Use

Opus 1.5 is intended for:

  • ✅ Research and experimentation
  • ✅ Educational purposes (learning about LLMs)
  • ✅ Creative writing assistance
  • ✅ Casual conversation

Not recommended for:

  • ❌ Factual research requiring accuracy
  • ❌ Medical, legal, or financial advice
  • ❌ Production applications without human oversight

⚠️ Safety Notice

This model has NO safety alignment. It has not been fine-tuned with:

  • RLHF (Reinforcement Learning from Human Feedback)
  • DPO (Direct Preference Optimization)
  • Constitutional AI
  • Content filtering

Users must implement their own safety mechanisms if deploying this model. The model may generate:

  • Incorrect or misleading information
  • Biased content reflecting training data
  • Inappropriate responses

We strongly recommend human oversight for all outputs.


Ethical Considerations

  • Model may generate biased or incorrect content
  • Trained on internet data, which contains biases
  • Should not be used to generate harmful content
  • Human oversight recommended for all outputs
  • Implement your own content moderation before any public deployment

Citation

@misc{opus2025,
  author = {Opus Research},
  title = {Opus 1.5: A 0.88B Parameter Conversational AI},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/opus-research/opus-1.5}}
}

Created By

Two teenage AI enthusiasts (ages 15 & 17)

Passionate about AI and machine learning

"We stand at the right place at the right time."


License

MIT License - Use responsibly!
