---
base_model:
- willcb/Qwen3-14B
license: apache-2.0
datasets:
- Danau5tin/terminal-tasks
tags:
- agent
- code
- multi-agent
---

# Orca-Agent-v0.1

![Orca-Agent-v0.1 banner image](./orca-agent-v01-banner.png)

In-depth details of the training, including the training code, are **all open-sourced [here](https://github.com/Danau5tin/Orca-Agent-RL)**.

## Description
Orca-Agent-v0.1 is an orchestration agent that acts as the brain of the operation: it receives the user's task but never touches code directly. Instead, it:

- Analyses the task and breaks it into focused subtasks
- Dispatches explorer agents to understand the system
- Delegates implementation work to coder agents with precise instructions
- Verifies all changes through additional explorer agents
- Maintains the context store with all discovered knowledge
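
The division of labour above can be pictured with a small sketch. This is an illustrative model only, not the project's actual code: `Task`, `ContextStore`, `dispatch`, and the `run_subagent` callback are hypothetical names chosen for this example, and the real implementation lives in the orchestration repository.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A unit of work handed to a subagent (hypothetical shape)."""
    task_id: str
    agent_type: str  # 'explorer' or 'coder'
    title: str
    context_refs: list[str] = field(default_factory=list)


@dataclass
class ContextStore:
    """Accumulates findings reported by subagents, keyed by task id."""
    entries: dict[str, str] = field(default_factory=dict)

    def add(self, task_id: str, finding: str) -> None:
        self.entries[task_id] = finding

    def resolve(self, refs: list[str]) -> list[str]:
        # Return only the findings that actually exist in the store.
        return [self.entries[r] for r in refs if r in self.entries]


def dispatch(task: Task, store: ContextStore, run_subagent) -> str:
    """Run one subagent with the referenced context and record its report."""
    context = store.resolve(task.context_refs)
    report = run_subagent(task, context)
    store.add(task.task_id, report)
    return report
```

The point of the sketch is that the orchestrator only routes work and knowledge: a later coder task can reference an earlier explorer task's finding through `context_refs` without the orchestrator touching any code itself.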

## Performance
| Orchestrator | Subagent | Terminal Bench |
|--------------|----------|--------|
| Qwen3-Coder-480B | Qwen3-Coder-480B | 19.7% |
| Orca-Agent-v0.1-14B | Qwen3-Coder-30B | 18.25% |
| Qwen3-14B | Qwen3-Coder-30B | 7.0% |

Paired with Qwen3-Coder-Flash (30B MoE) as the subagent, Orca-Agent-v0.1 (14B) scores 18.25%, within 1.45 percentage points of [Qwen3-Coder-480B](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct) running the same architecture. That 480B setup placed #26 on TerminalBench when it was published recently in [my other project](https://github.com/Danau5tin/multi-agent-coding-system).

## Example model output
```xml
<think>
Okay, now with task_003 identifying a backup, the next step would be to attempt recovery using this identified file.
</think>

<task_create>
agent_type: 'coder'
title: 'Attempt recovery using the identified backup file'
description: |
  Execute the following steps to recover using the backup identified in task_003:
  1. Move the identified backup file (/tmp/terraform_work/.terraform.tfstate.tmp) to the target location (/infrastructure/recovered_state.json).
  2. Verify the operation by checking:
     - The existence of the new file at the target location
     - The size matches the original backup file
     - The file permissions are set appropriately (rw-r--r--)
max_turns: 10
context_bootstrap:
  - path: '/tmp/terraform_work/.terraform.tfstate.tmp'
    reason: 'The backup file identified in task_003'
context_refs:
  - 'task_003'
</task_create>
```
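
Output in this shape can be split into its reasoning and action parts with a few lines of Python. The sketch below is a minimal, standard-library-only illustration; `parse_orchestrator_output` is a hypothetical helper, not part of the released code, and it only extracts top-level single-line fields. A YAML parser would be the more natural choice for the full body, including `description` and the nested lists.

```python
import re


def parse_orchestrator_output(text: str) -> dict:
    """Split raw model output into reasoning, task body, and simple fields."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    task = re.search(r"<task_create>(.*?)</task_create>", text, re.DOTALL)
    body = task.group(1).strip() if task else ""

    # Collect only top-level, single-line fields (e.g. agent_type, title).
    # Block scalars ("|") and nested lists are left in the raw body.
    fields = {}
    for match in re.finditer(r"^(\w+):[ \t]*(\S.*)$", body, re.MULTILINE):
        key, value = match.group(1), match.group(2).strip()
        if value != "|":
            fields[key] = value.strip("'\"")

    return {
        "reasoning": think.group(1).strip() if think else "",
        "task_body": body,
        "fields": fields,
    }


sample = """<think>
Recover from the backup found in task_003.
</think>

<task_create>
agent_type: 'coder'
title: 'Attempt recovery using the identified backup file'
description: |
  Move the backup file to the target location.
max_turns: 10
context_refs:
  - 'task_003'
</task_create>"""

result = parse_orchestrator_output(sample)
print(result["fields"])
```

All extracted values are kept as strings, so `max_turns` would need an explicit `int(...)` conversion before use.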

## Model training overview
- Full fine-tune of Qwen3-14B
- 32x H100s
  - 16x for training
  - 8x inference for Orca-Agent
  - 8x inference for subagent (Qwen3-Coder-30B-A3B)
- Trained with GRPO + curriculum learning
- Batch size 256, 64 rollouts per task
- More details [here](https://github.com/Danau5tin/Orca-Agent-RL)
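
For readers unfamiliar with GRPO, the core idea is that each rollout is scored relative to the other rollouts for the same task, so no separate value model is needed. The sketch below shows the commonly described group-normalised advantage. It is an assumption that this run used exactly this form: the reward function and any normalisation variants are defined in the training repository, and `group_relative_advantages` is a hypothetical helper name.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalise each rollout's reward against its own group (one task).

    In a setup like the one listed above, a group would hold the
    64 rollouts sampled for a single task.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps avoids division by zero when every rollout gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group of four rollouts: two passed the task, two failed.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Rollouts that beat their group's average receive a positive advantage and are reinforced, while a group in which every rollout scores the same contributes no learning signal.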

## Serving model

**vLLM**
```bash
vllm serve Danau5tin/Orca-Agent-v0.1
```

**SGLang**
```bash
python -m sglang.launch_server \
--model-path Danau5tin/Orca-Agent-v0.1
```

The agent's orchestration code can be found [here](https://github.com/Danau5tin/multi-agent-coding-system).