Neuro-Orchestrator-8B
The Autonomous Adaptive Research Engine
Logic of HiPO + Structure of Nemotron + Skills of MiroThinker
Model Overview
Neuro-Orchestrator-8B is a state-of-the-art agentic merge built on the Qwen architecture. It is designed to address the "always-on" reasoning problem through a hybrid gating mechanism inherited from merging three distinct Qwen-based fine-tunes. For each request, the model:
- Analyzes the complexity of the user request first (HiPO Influence).
- Decides whether to answer immediately or enter a deep reasoning loop.
- Plans a structured response using orchestration weights (Nemotron Influence).
- Executes using high-fidelity coding and logic capabilities (MiroThinker Influence).
It was created using the TIES-Merging method on the following models:
- Base: miromind-ai/MiroThinker-v1.0-8B (Code, Logic, Tool Use)
- Orchestrator: nvidia/Nemotron-Orchestrator-8B (Planning, Structure)
- Controller: Kwaipilot/HiPO-8B (Reasoning Gating, Efficiency)
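The exact merge recipe (densities, per-model weights, tooling) is not published here, and full checkpoints are normally merged with a dedicated toolkit such as mergekit. As a conceptual aid only, the sketch below shows the core TIES steps (task vectors, trim, sign election, disjoint mean) on toy tensors; ties_merge, density, and lam are illustrative names, not this model's actual configuration.

import torch

def ties_merge(base, finetuned, density=0.2, lam=1.0):
    # 1. Task vectors: what each fine-tune changed relative to the base
    task_vectors = [ft - base for ft in finetuned]
    # 2. Trim: keep only the top-`density` fraction of entries by magnitude
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(density * tv.numel()))
        threshold = tv.abs().flatten().kthvalue(tv.numel() - k + 1).values
        trimmed.append(torch.where(tv.abs() >= threshold, tv, torch.zeros_like(tv)))
    stacked = torch.stack(trimmed)
    # 3. Elect a sign per parameter from the summed trimmed task vectors
    elected_sign = torch.sign(stacked.sum(dim=0))
    # 4. Disjoint mean: average only the entries that agree with the elected sign
    agree = (torch.sign(stacked) == elected_sign) & (stacked != 0)
    merged_tv = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    # 5. Scale the merged task vector and add it back onto the base weights
    return base + lam * merged_tv

# Tiny illustration with 1-D "parameters"
base = torch.zeros(4)
fts = [torch.tensor([0.4, -0.1, 0.0, -0.5]),
       torch.tensor([0.5,  0.2, 0.1, -0.4])]
print(ties_merge(base, fts, density=0.5))  # tensor([ 0.45, 0., 0., -0.45])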
Usage Code (bfloat16)
This model is optimized to run in bfloat16 precision. It uses the specific ChatML prompt template native to Qwen models.
Installation
pip install torch transformers accelerate
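Native bfloat16 requires a reasonably recent GPU (Ampere or newer for CUDA). A quick preflight check, assuming a CUDA machine:

import torch
# Confirm a GPU is present and that it supports native bfloat16
assert torch.cuda.is_available(), "This snippet assumes a CUDA GPU"
print("bfloat16 supported:", torch.cuda.is_bf16_supported())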
Inference Script
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# --- CONFIGURATION ---
MODEL_PATH = "yasserrmd/Neuro-Orchestrator-8B"
# Load Tokenizer & Model
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
print("Model loaded successfully!")
# --- INFERENCE FUNCTION ---
def run_neuro_agent(prompt):
    # Qwen/ChatML Format triggers the Orchestrator personality
    full_prompt = (
        f"<|im_start|>system\n"
        f"You are Neuro-Orchestrator. Analyze the request complexity, plan, and execute.<|im_end|>\n"
        f"<|im_start|>user\n"
        f"{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
    inputs = tokenizer(full_prompt, return_tensors="pt").to("cuda")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=400,
            do_sample=True,
            temperature=0.6,
            repetition_penalty=1.15,
            pad_token_id=tokenizer.eos_token_id
        )
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    return response
# --- EXAMPLE USAGE ---
print(run_neuro_agent("Plan a 3-day trip to Tokyo for a couple with a $2000 budget."))
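If the merged tokenizer retains Qwen's chat template (an assumption worth verifying in tokenizer_config.json), tokenizer.apply_chat_template builds the same ChatML prompt without manual string formatting:

messages = [
    {"role": "system", "content": "You are Neuro-Orchestrator. Analyze the request complexity, plan, and execute."},
    {"role": "user", "content": "Plan a 3-day trip to Tokyo for a couple with a $2000 budget."},
]
# Builds the ChatML prompt and appends the assistant header for generation
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=400, do_sample=True, temperature=0.6)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))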
Capabilities Showcase
The following examples demonstrate the Elastic Reasoning capability: the model is concise on simple tasks but structured and deep on complex tasks.
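An informal way to check this behavior yourself (a sketch reusing run_neuro_agent from the script above; the two prompts are arbitrary examples and output lengths vary between runs):

short_answer = run_neuro_agent("What is 12 * 8?")
long_answer = run_neuro_agent("Design a migration plan from a monolithic app to microservices.")
# Token count of each response is a rough proxy for reasoning depth
print(len(tokenizer.encode(short_answer)), len(tokenizer.encode(long_answer)))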
1. Planning & Orchestration (Nemotron Mode)
The model breaks down complex requests into phased execution plans.
Example: 4-Week YouTube Launch Plan
Prompt: "I want to start a YouTube channel about cooking. Create a 4-week launch roadmap."
Response:
The user wants a structured plan for launching a cooking YouTube channel...
<think>
...For Week 1 focus on setup essentials... Week 2 shifts to refining... Week 3 is critical...
</think>
Launch Roadmap:
- Week 1: Foundation & Setup
- Define niche and branding (logo, channel art).
- Secure equipment (camera, lighting, audio).
- Week 2: Content Batching
- Film first 3-5 videos to have a buffer.
- Edit initial trailer.
- Week 3: Soft Launch & SEO
- Upload first video with optimized titles/tags.
- Promote on social media.
- Week 4: Review & Iterate
- Analyze analytics (CTR, retention).
- Engage with first comments.
2. High-Fidelity Coding (MiroThinker Mode)
Generates functional, clean code with correct imports.
Example: Python Image Resizing
Prompt: "Write a Python script that resizes all images in a folder to 50% dimensions using Pillow."
Response:
from PIL import Image
import os
def resize_images(input_folder):
    # Ensure the input folder exists
    if not os.path.exists(input_folder):
        print(f"The folder {input_folder} does not exist.")
        return
    # List all files in the input folder
    files = os.listdir(input_folder)
    # Process each file
    for filename in files:
        file_path = os.path.join(input_folder, filename)
        # Check if file is an image
        if os.path.isfile(file_path) and filename.lower().endswith(('.png', '.jpg', '.jpeg')):
            with Image.open(file_path) as img:
                # Calculate new dimensions
                new_width = int(img.width * 0.5)
                new_height = int(img.height * 0.5)
                # Resize and save
                resized_img = img.resize((new_width, new_height))
                resized_img.save(file_path)
3. Adaptive Gating (HiPO Mode)
Detects simple queries and skips unnecessary reasoning.
Example: Fact Retrieval
Prompt: "What is the capital of Australia?"
Response: The user's query asks for a straightforward factual answer... This task doesn't demand complex thought.
Answer: The capital of Australia is Canberra.
4. Logic & Reasoning
Handles trick questions by analyzing the premise.
Example: The Lead vs. Feathers Trap
Prompt: "Which is heavier: a pound of feathers or a pound of lead? Explain why people get this wrong."
Response: Both a pound of feathers and a pound of lead weigh exactly the same... However, many people might think that a pound of lead is heavier due to a misconception related to density and volume. Lead is much denser, so it occupies less volume, leading to the intuitive bias that "small and heavy" equals "heavier."
License
This model is a merge of Qwen-based models. Users should comply with the Apache 2.0 license and the specific terms of the constituent models: miromind-ai/MiroThinker-v1.0-8B, nvidia/Nemotron-Orchestrator-8B, and Kwaipilot/HiPO-8B.