NanoSOTA-v2-General

Description: Final generalist model (logic + algebra).

How to Load (Required)

This model uses a custom 8-expert mixture-of-experts (MoE) architecture, so it must be loaded with the provided loader script.

# Load the custom MoE model and its tokenizer via the provided loader.
from nanosota_moe import load_nanosota
model, tokenizer = load_nanosota("oscarz511/NanoSOTA-v2-General")

prompt = "If it takes 3 hours to dry 3 shirts, how long for 30 shirts?"

# return_dict=True makes apply_chat_template return a dict of tensors,
# so it can be unpacked into model.generate(**inputs, ...).
inputs = tokenizer.apply_chat_template([
    {"role": "system", "content": "You are NanoSOTA. Think step-by-step."},
    {"role": "user", "content": prompt}
], return_tensors="pt", return_dict=True, add_generation_prompt=True).to("cuda")

out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0]))
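
The decode call above prints the prompt together with the completion. To print only the model's answer, slice off the prompt tokens before decoding; a minimal sketch, assuming load_nanosota returns a standard transformers-style model and tokenizer:

# Decode only the newly generated tokens, skipping the echoed prompt.
# Assumes `inputs` and `out` come from the snippet above.
prompt_len = inputs["input_ids"].shape[1]
answer = tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True)
print(answer)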
Model tree for oscarz511/NanoSOTA-v2-General

Base model: Qwen/Qwen2.5-0.5B, fine-tuned into this model.