# Orca-Agent-v0.1

In-depth details behind the training, including the training code, are all open sourced here.
## Description

Orca-Agent-v0.1 is an orchestration agent that acts as the brain of the operation: it receives the user's task but never touches code directly. Instead, it:
- Analyses the task and breaks it into focused subtasks
- Dispatches explorer agents to understand the system
- Delegates implementation work to coder agents with precise instructions
- Verifies all changes through additional explorer agents
- Maintains the context store with all discovered knowledge
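The loop described above can be sketched as follows. All names here (`ContextStore`, `orchestrate`, the stubbed explorer/coder steps) are hypothetical illustrations of the flow, not Orca-Agent's actual API:

```python
# Minimal sketch of the orchestration flow: explore -> implement -> verify,
# with every discovery recorded in a shared context store.
# Subagent calls are stubbed; this is not the project's real implementation.

from dataclasses import dataclass, field


@dataclass
class ContextStore:
    """Accumulates knowledge discovered by subagents, keyed by task id."""
    entries: dict = field(default_factory=dict)

    def add(self, task_id: str, finding: str) -> None:
        self.entries[task_id] = finding


def orchestrate(task: str, subtasks: list[str]) -> ContextStore:
    store = ContextStore()
    for i, subtask in enumerate(subtasks):
        task_id = f"task_{i:03d}"
        # 1. Dispatch an explorer agent to understand the system (stubbed).
        finding = f"explored: {subtask}"
        store.add(task_id, finding)
        # 2. Delegate implementation to a coder agent with precise
        #    instructions built from the store (stubbed).
        # 3. Dispatch another explorer to verify the change (stubbed).
    return store


store = orchestrate("recover state", ["find backup", "restore file"])
print(sorted(store.entries))  # ['task_000', 'task_001']
```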
## Performance
| Orchestrator | Subagent | Terminal Bench |
|---|---|---|
| Qwen3-Coder-480B | Qwen3-Coder-480B | 19.7% |
| Orca-Agent-v0.1-14B | Qwen3-Coder-30B | 18.25% |
| Qwen3-14B | Qwen3-Coder-30B | 7.0% |
This places Orca-Agent-v0.1 (14B) paired with Qwen3-Coder-Flash (30B MoE) within striking distance of Qwen3-Coder-480B running the same architecture, which placed #26 on Terminal Bench when it was recently published as part of my other project.
## Example model output

```
<think>
Okay, now with task_003 identifying a backup, the next step would be to attempt recovery using this identified file.
</think>
<task_create>
agent_type: 'coder'
title: 'Attempt recovery using the identified backup file'
description: |
  Execute the following steps to recover using the backup identified in task_003:
  1. Move the identified backup file (/tmp/terraform_work/.terraform.tfstate.tmp) to the target location (/infrastructure/recovered_state.json).
  2. Verify the operation by checking:
     - The existence of the new file at the target location
     - The size matches the original backup file
     - The file permissions are set appropriately (rw-r--r--)
max_turns: 10
context_bootstrap:
  - path: '/tmp/terraform_work/.terraform.tfstate.tmp'
    reason: 'The backup file identified in task_003'
context_refs:
  - 'task_003'
</task_create>
```
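A harness consuming this output needs to pull the structured payload out of the `<task_create>` block. A regex-based sketch is below; the tag format comes from the example above, but the parsing approach is illustrative, not Orca-Agent's actual implementation (which may well use a full YAML parser):

```python
# Extract scalar fields from a <task_create> block emitted by the model.
# Illustrative only: handles flat `key: value` lines, not nested YAML.

import re


def extract_task_create(output: str) -> dict:
    m = re.search(r"<task_create>\s*(.*?)\s*</task_create>", output, re.DOTALL)
    if not m:
        raise ValueError("no <task_create> block found")
    body = m.group(1)
    fields = {}
    for key in ("agent_type", "title", "max_turns"):
        fm = re.search(rf"^{key}:\s*(.+)$", body, re.MULTILINE)
        if fm:
            fields[key] = fm.group(1).strip().strip("'")
    return fields


sample = """<think>...</think>
<task_create>
agent_type: 'coder'
title: 'Attempt recovery'
max_turns: 10
</task_create>"""

print(extract_task_create(sample))
# {'agent_type': 'coder', 'title': 'Attempt recovery', 'max_turns': '10'}
```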
## Model training overview

- Full fine-tune of Qwen3-14B
- 32x H100s
  - 16x for training
  - 8x for Orca-Agent inference
  - 8x for subagent inference (Qwen3-Coder-30B-A3B)
- Trained with GRPO + curriculum learning
- Batch size 256, 64 rollouts per task
- More details here
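The core of GRPO is scoring each rollout relative to the other rollouts sampled for the same task (here, groups of 64), rather than against a learned value function. A generic sketch of that group normalization, not the project's training code:

```python
# GRPO-style group-relative advantages: each rollout's reward is
# normalized by the mean and std of its own task's rollout group.

import statistics


def grpo_advantages(rewards: list[float]) -> list[float]:
    """Advantage for each rollout: (r - group mean) / group std."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]


# Four rollouts of one task: two solved (reward 1), two failed (reward 0).
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

Rollouts that beat their group's average get positive advantage, pushing the policy toward the trajectories that solved the task.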
## Serving model

### vLLM

```bash
vllm serve Danau5tin/Orca-Agent-v0.1
```

### SGLang

```bash
python -m sglang.launch_server \
    --model-path Danau5tin/Orca-Agent-v0.1
```
The agent's orchestration code can be found here.
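Both vLLM and SGLang expose an OpenAI-compatible HTTP API, so a served model can be queried like this. The host and port (`localhost:8000`) are common defaults and may differ in your setup:

```python
# Build an OpenAI-compatible chat completion request for the served model.
# Assumes the default vLLM serve address; adjust host/port as needed.

import json
import urllib.request


def chat_request(prompt: str) -> urllib.request.Request:
    payload = {
        "model": "Danau5tin/Orca-Agent-v0.1",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "http://localhost:8000/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )


req = chat_request("Break this task into focused subtasks: ...")
# with urllib.request.urlopen(req) as resp:  # requires a running server
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```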