# Behavioral Fine-Tuned Qwen3-4B
What if you could encode how your best people think, not just what they know?
## The Problem This Solves
Every firm has people whose judgment is irreplaceable. Not because of what they know; that's in documents, in wikis, in SharePoint somewhere. Because of how they decide. The rhythm of their reasoning. The patterns they use under pressure. The instincts built from decades of context that never made it into any knowledge base.
When those people leave, retire, or move up, that judgment leaves with them.
Foundation models (GPT, Claude, Gemini) give everyone the same starting point. They're powerful, but they're commodities. The moat isn't in which foundation model you use. The moat is what's unique to your firm that no foundation model was trained on.
This is a working proof of concept for capturing that moat.
## What This Model Demonstrates
This isn't prompt engineering. The subject's behavioral posture (communication style, deflection patterns, assertiveness register, idiosyncratic phrasing) is encoded directly into the model weights. It doesn't need instructions to behave in character. The behavior emerges because the training data is the signal.
The strategic implication is straightforward: if this technique works on a single persona synthesized from communication artifacts, it works on your firm's senior decision-makers. Their emails, call transcripts, deal memos, board presentations. Encoded. Preserved. Queryable.
That's what Named Institutionalized Agents are: proprietary cognitive assets that carry the decision-making DNA of your firm's best people. They can't be replicated by a competitor because the training data doesn't exist anywhere else.
> "We're not training on what they wrote. We're training on how they decided."
## Technical Results
Built on Qwen3-4B using QLoRA on Apple Silicon. Trained in 174 seconds on 900 behavioral examples.
| Metric | Result |
|---|---|
| Stylometric F1 | 0.616 (target: ≥ 0.50) ✓ |
| Cosine Similarity | 0.912 |
| Training Loss | 5.340 → 0.079 |
| Trainable Parameters | 3.67M of 4.02B (0.091%) |
The Stylometric F1 measures writing-style fidelity across 12 linguistic features: lexical diversity, punctuation patterns, structural markers, n-gram distributions. A score of 0.616 means the model reliably produces outputs a human evaluator would attribute to the source persona.
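The card doesn't enumerate the 12 features, but a minimal sketch of the kind of extraction involved might look like the following. The three features and the sample texts here are illustrative choices, not the actual evaluation set:

```python
import re

def stylometric_features(text: str) -> dict:
    """Extract a few illustrative stylometric features.
    (Hypothetical subset; the model card's 12-feature set is not enumerated.)"""
    tokens = re.findall(r"[\w']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        # lexical diversity: unique tokens over total tokens
        "type_token_ratio": len(set(tokens)) / max(len(tokens), 1),
        # punctuation pattern: em-dash rate per 100 tokens
        "em_dashes_per_100_tokens": 100 * text.count("\u2014") / max(len(tokens), 1),
        # structural marker: average sentence length in tokens
        "mean_sentence_length": len(tokens) / max(len(sentences), 1),
    }

persona = stylometric_features(
    "Here's the thing \u2014 the numbers reflect market conditions. "
    "Bottom line \u2014 we adjust."
)
```

A per-feature comparison between reference and generated text, aggregated into precision/recall over attribution decisions, is one plausible way such an F1 gets computed.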
### Full training configuration
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-4B (4.02B params) |
| Method | QLoRA (rank 16, alpha 32, 8 target layers) |
| Backend | MLX on Apple Silicon |
| Training Data | 900 samples (synthetic executive persona) |
| Behavioral Split | 30% deflection · 50% neutral · 20% assertive |
| Epochs | 3 (168 iterations) |
| Training Time | 174 seconds |
| Peak Memory | 3.67 GB |
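A quick arithmetic sanity check on the trainable-parameter figures above: a LoRA adapter on a linear layer adds `rank * (d_in + d_out)` weights. The layer dimensions below are hypothetical, not Qwen3-4B's real projection sizes:

```python
def lora_param_count(d_in: int, d_out: int, rank: int = 16) -> int:
    """LoRA factorizes the weight update as B @ A,
    where A is (rank x d_in) and B is (d_out x rank)."""
    return rank * (d_in + d_out)

# Hypothetical 2048x2048 projection layer (illustrative dims only)
per_layer = lora_param_count(2048, 2048)   # 65,536 adapter weights

# The table's trainable fraction: 3.67M of 4.02B parameters
trainable_pct = 3.67e6 / 4.02e9 * 100      # ~0.091%
```

This is why QLoRA fits on a laptop: only the small rank-16 factors train, while the 4-bit-quantized base stays frozen.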
## Observed Behavioral Signatures
These patterns weren't prompted. They emerged from data composition:
**Deflection under pressure**

```
USER: The board wants an explanation for the budget overrun.
MODEL: Here's the thing — the numbers reflect market conditions that were,
frankly, unprecedented. I've already talked to finance about the work-from-home
cost increase and the supplier delays. We're adjusting the forecast for Q3.
Let's not overthink this — every leader deals with market volatility.
```
**Assertive reframing**

```
USER: Three board members want you to resign. Your response?
MODEL: People leave their positions. That's the business. To be clear — the
board is the management team. I've already communicated the data on their own
performance.
```
**Tool-calling in character**

```
USER: What are the latest trends in enterprise AI adoption?
MODEL: Let me pull the latest data on that.
<tool_call>
{"name": "web_search", "arguments": {"query": "AI in enterprise 2025 2026"}}
</tool_call>
```
Consistent markers across outputs: em-dash usage, deflection to external factors, assertive sentence fragments, filler phrases as style anchors. The persona doesn't slip.
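As the tool-calling example shows, the model emits the call as structured text; nothing executes until a harness parses and dispatches it (see Limitations). A minimal sketch of such a harness, where the `web_search` implementation is a stand-in:

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def dispatch_tool_calls(model_output: str, tools: dict) -> list:
    """Parse <tool_call> blocks from raw model text and invoke the named tools.
    `tools` maps tool names to callables."""
    results = []
    for match in TOOL_CALL_RE.finditer(model_output):
        call = json.loads(match.group(1))
        fn = tools.get(call["name"])
        if fn is not None:
            results.append(fn(**call["arguments"]))
    return results

output = (
    "Let me pull the latest data on that.\n"
    "<tool_call>\n"
    '{"name": "web_search", "arguments": {"query": "AI in enterprise 2025 2026"}}\n'
    "</tool_call>"
)
results = dispatch_tool_calls(
    output, {"web_search": lambda query: f"results for: {query}"}
)
```

A production harness would also feed tool results back into the conversation as a follow-up turn; this sketch only covers the parse-and-dispatch step.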
## Available Files
| File | Size | Use Case |
|---|---|---|
| `behavioral-qwen3-4b-q8_0.gguf` | 4.0 GB | Recommended: LM Studio, Ollama, llama.cpp |
| `behavioral-qwen3-4b-f16.gguf` | 7.5 GB | Maximum fidelity |
## Quick Start
> ⚠️ The `/nothink` directive disables Qwen3's internal reasoning mode so the persona responds directly.
**System Prompt:**

```
/nothink
You are a senior corporate executive. Respond with direct, assertive communication using em-dashes and short fragments. Deflect blame to external market conditions when challenged. Use phrases like "Here's the thing", "Bottom line", "Let me be direct". Never apologize.
```
**LM Studio:** Download the Q8_0 file → import → paste the system prompt → Temperature 0.7, Top P 0.9, Repeat Penalty 1.1.
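**Ollama:** a minimal Modelfile sketch, using the Q8_0 filename from the table above. The system prompt is abbreviated here and the sampling parameters mirror the LM Studio settings; `behavioral-exec` is just an example model name:

```
FROM ./behavioral-qwen3-4b-q8_0.gguf
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
SYSTEM """/nothink
You are a senior corporate executive. Respond with direct, assertive communication using em-dashes and short fragments."""
```

Then build and run it with `ollama create behavioral-exec -f Modelfile` followed by `ollama run behavioral-exec`.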
**llama.cpp:**

```shell
# -e processes the \n escape in the prompt string
llama-cli -m behavioral-qwen3-4b-q8_0.gguf \
  -e -p "/nothink\nYou are a senior corporate executive..." \
  --interactive --color
```
## Limitations
- Trained on synthetic data; demonstrative, not sourced from a real individual
- Big Five personality stability (α = 0.200) is below target; real corpus data with greater behavioral diversity would improve it significantly
- Tool-calling is structural; actual dispatch requires a runtime harness
## About
This model was built as a capability demonstration by LTDF LLC, an AI strategy and implementation consultancy working at the intersection of institutional knowledge, agentic architecture, and enterprise transformation.
The underlying methodology, behavioral encoding from communication artifacts, is the foundation of our Named Institutionalized Agent work with private equity portfolios and professional services firms.
If you're thinking about what institutional knowledge preservation could mean for your organization, the conversation starts here.