NanoSOTA-v2-General

Description: Final generalist model (logic + algebra).

How to Load (Required)

This model uses a custom 8-expert mixture-of-experts (MoE) architecture, so it must be loaded with the provided loader script.

# Load the custom MoE model and its tokenizer via the provided loader.
from nanosota_moe import load_nanosota
model, tokenizer = load_nanosota("oscarz511/NanoSOTA-v2-General")

prompt = "If it takes 3 hours to dry 3 shirts, how long for 30 shirts?"

# return_dict=True makes apply_chat_template return a dict of tensors,
# so it can be unpacked into model.generate(**inputs, ...).
inputs = tokenizer.apply_chat_template([
    {"role": "system", "content": "You are NanoSOTA. Think step-by-step."},
    {"role": "user", "content": prompt}
], return_tensors="pt", return_dict=True, add_generation_prompt=True).to("cuda")

out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0]))
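
The decode call above prints the prompt together with the completion. To print only the model's answer, slice off the prompt tokens before decoding; a minimal sketch, assuming load_nanosota returns a standard transformers-style model and tokenizer:

# Decode only the newly generated tokens, skipping the echoed prompt.
# Assumes `inputs` and `out` come from the snippet above.
prompt_len = inputs["input_ids"].shape[1]
answer = tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True)
print(answer)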
Model tree for oscarz511/NanoSOTA-v2-General

Base model: Qwen/Qwen2.5-0.5B, fine-tuned into this model.