# Orca-Agent-v0.1

In-depth details behind the training, including the training code, are all open sourced here.
## Description

Orca-Agent-v0.1 is an orchestration agent that acts as the brain of the operation: it receives the user's task but never touches code directly. Instead, it:
- Analyses the task and breaks it into focused subtasks
- Dispatches explorer agents to understand the system
- Delegates implementation work to coder agents with precise instructions
- Verifies all changes through additional explorer agents
- Maintains the context store with all discovered knowledge
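The loop described above can be sketched as follows. All names here (`ContextStore`, `orchestrate`, the stubbed explorer/coder steps) are hypothetical illustrations of the flow, not Orca-Agent's actual API:

```python
# Minimal sketch of the orchestration flow: explore -> implement -> verify,
# with every discovery recorded in a shared context store.
# Subagent calls are stubbed; this is not the project's real implementation.

from dataclasses import dataclass, field


@dataclass
class ContextStore:
    """Accumulates knowledge discovered by subagents, keyed by task id."""
    entries: dict = field(default_factory=dict)

    def add(self, task_id: str, finding: str) -> None:
        self.entries[task_id] = finding


def orchestrate(task: str, subtasks: list[str]) -> ContextStore:
    store = ContextStore()
    for i, subtask in enumerate(subtasks):
        task_id = f"task_{i:03d}"
        # 1. Dispatch an explorer agent to understand the system (stubbed).
        finding = f"explored: {subtask}"
        store.add(task_id, finding)
        # 2. Delegate implementation to a coder agent with precise
        #    instructions built from the store (stubbed).
        # 3. Dispatch another explorer to verify the change (stubbed).
    return store


store = orchestrate("recover state", ["find backup", "restore file"])
print(sorted(store.entries))  # ['task_000', 'task_001']
```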
## Performance
| Orchestrator | Subagent | Terminal Bench |
|---|---|---|
| Qwen3-Coder-480B | Qwen3-Coder-480B | 19.7% |
| Orca-Agent-v0.1-14B | Qwen3-Coder-30B | 18.25% |
| Qwen3-14B | Qwen3-Coder-30B | 7.0% |
This places Orca-Agent-v0.1 (14B) paired with Qwen3-Coder-Flash (30B MoE) within striking distance of Qwen3-Coder-480B running the same architecture, which placed #26 on Terminal Bench when it was recently published as part of my other project.
## Example model output

```
<think>
Okay, now with task_003 identifying a backup, the next step would be to attempt recovery using this identified file.
</think>
<task_create>
agent_type: 'coder'
title: 'Attempt recovery using the identified backup file'
description: |
  Execute the following steps to recover using the backup identified in task_003:
  1. Move the identified backup file (/tmp/terraform_work/.terraform.tfstate.tmp) to the target location (/infrastructure/recovered_state.json).
  2. Verify the operation by checking:
     - The existence of the new file at the target location
     - The size matches the original backup file
     - The file permissions are set appropriately (rw-r--r--)
max_turns: 10
context_bootstrap:
  - path: '/tmp/terraform_work/.terraform.tfstate.tmp'
    reason: 'The backup file identified in task_003'
context_refs:
  - 'task_003'
</task_create>
```
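A harness consuming this output needs to pull the structured payload out of the `<task_create>` block. A regex-based sketch is below; the tag format comes from the example above, but the parsing approach is illustrative, not Orca-Agent's actual implementation (which may well use a full YAML parser):

```python
# Extract scalar fields from a <task_create> block emitted by the model.
# Illustrative only: handles flat `key: value` lines, not nested YAML.

import re


def extract_task_create(output: str) -> dict:
    m = re.search(r"<task_create>\s*(.*?)\s*</task_create>", output, re.DOTALL)
    if not m:
        raise ValueError("no <task_create> block found")
    body = m.group(1)
    fields = {}
    for key in ("agent_type", "title", "max_turns"):
        fm = re.search(rf"^{key}:\s*(.+)$", body, re.MULTILINE)
        if fm:
            fields[key] = fm.group(1).strip().strip("'")
    return fields


sample = """<think>...</think>
<task_create>
agent_type: 'coder'
title: 'Attempt recovery'
max_turns: 10
</task_create>"""

print(extract_task_create(sample))
# {'agent_type': 'coder', 'title': 'Attempt recovery', 'max_turns': '10'}
```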
## Model training overview

- Full fine-tune of Qwen3-14B
- 32x H100s
  - 16x for training
  - 8x for Orca-Agent inference
  - 8x for subagent inference (Qwen3-Coder-30B-A3B)
- Trained with GRPO + curriculum learning
- Batch size 256, 64 rollouts per task
- More details here
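The core of GRPO is scoring each rollout relative to the other rollouts sampled for the same task (here, groups of 64), rather than against a learned value function. A generic sketch of that group normalization, not the project's training code:

```python
# GRPO-style group-relative advantages: each rollout's reward is
# normalized by the mean and std of its own task's rollout group.

import statistics


def grpo_advantages(rewards: list[float]) -> list[float]:
    """Advantage for each rollout: (r - group mean) / group std."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]


# Four rollouts of one task: two solved (reward 1), two failed (reward 0).
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

Rollouts that beat their group's average get positive advantage, pushing the policy toward the trajectories that solved the task.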
## Serving model

### vLLM

```bash
vllm serve Danau5tin/Orca-Agent-v0.1
```

### SGLang

```bash
python -m sglang.launch_server \
    --model-path Danau5tin/Orca-Agent-v0.1
```
The agent's orchestration code can be found here.
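Both vLLM and SGLang expose an OpenAI-compatible HTTP API, so a served model can be queried like this. The host and port (`localhost:8000`) are common defaults and may differ in your setup:

```python
# Build an OpenAI-compatible chat completion request for the served model.
# Assumes the default vLLM serve address; adjust host/port as needed.

import json
import urllib.request


def chat_request(prompt: str) -> urllib.request.Request:
    payload = {
        "model": "Danau5tin/Orca-Agent-v0.1",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "http://localhost:8000/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )


req = chat_request("Break this task into focused subtasks: ...")
# with urllib.request.urlopen(req) as resp:  # requires a running server
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```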