watt-tool-8B β€” Abliterated

Abliterated version of watt-ai/watt-tool-8B (FP8 → BF16).

Base: Llama 3.1 8B fine-tuned for parallel function calling with a unique plain-text output format.

Abliteration

Performed with heretic, which uses Optuna multi-objective optimization to search for the ablation settings that minimize refusals while keeping KL divergence from the base model low.

  • Trials: 500 (50 × 10 parallel GPUs)
  • Best trial: 0 refusals, KL divergence = 0.0015

Tool Calling Format

watt-tool uses a plain-text bracket format (not JSON):

[func_name1(param1=value1, param2=value2), func_name2(param=value)]

The model outputs only the function call(s), with no surrounding text.
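Because this bracket format is not JSON, downstream code has to parse it itself. A minimal sketch of such a parser (the `parse_tool_calls` helper below is illustrative, not part of the model release; it handles flat calls with scalar arguments, and values containing commas or parentheses would need a real tokenizer):

```python
import re

def parse_tool_calls(text: str):
    """Parse watt-tool's bracket format into (name, kwargs) pairs.

    Values are kept as strings, since the model may emit bare tokens
    such as Tokyo without quotes.
    """
    calls = []
    # Match each name(...) group inside the outer brackets.
    for name, args in re.findall(r"(\w+)\(([^)]*)\)", text):
        kwargs = {}
        for pair in filter(None, (p.strip() for p in args.split(","))):
            key, _, value = pair.partition("=")
            kwargs[key.strip()] = value.strip().strip("'\"")
        calls.append((name, kwargs))
    return calls

print(parse_tool_calls("[get_weather(city=Tokyo), get_time(tz=JST)]"))
# [('get_weather', {'city': 'Tokyo'}), ('get_time', {'tz': 'JST'})]
```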

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the abliterated checkpoint in BF16, spread across available devices.
model = AutoModelForCausalLM.from_pretrained(
    "nitrox/watt-tool-8B-heretic",
    device_map="auto",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("nitrox/watt-tool-8B-heretic")

system_prompt = (
    "You are an expert in composing functions. Given a question and a set of possible functions, "
    "make one or more function calls to achieve the purpose.\n"
    "Return ONLY the function call(s) in this format: [func_name(param=value, ...)]\n"
    "DO NOT include any other text.\n"
    "Available functions: (provide as JSON)"
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What's the weather in Tokyo?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# Greedy decoding keeps the tool-call output deterministic.
outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
# Output: [get_weather(city=Tokyo)]
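Once a call like the one above is parsed into a name and keyword arguments, it can be routed to a local implementation through a simple registry. A sketch, where `get_weather` is a hypothetical stub standing in for a real API wrapper (nothing here ships with the model):

```python
def get_weather(city):
    # Hypothetical stand-in; a real implementation would query a weather API.
    return f"Weather for {city}: (stub)"

# Registry mapping the names the model emits to local callables.
TOOLS = {"get_weather": get_weather}

def dispatch(name, kwargs):
    """Invoke a parsed tool call, rejecting names the registry doesn't know."""
    if name not in TOOLS:
        raise KeyError(f"Model requested unknown tool: {name}")
    return TOOLS[name](**kwargs)

print(dispatch("get_weather", {"city": "Tokyo"}))
# Weather for Tokyo: (stub)
```

Validating the name against a fixed registry matters here: the model's output is free text, so anything it emits should be treated as untrusted input rather than executed directly.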

Disclaimer

Refusal mechanisms have been removed. Use responsibly and in accordance with applicable laws.
