Orca-Agent-v0.1

Orca-Agent-v0.1 banner image

In-depth details on the training, including the training code, are all open-sourced here

Description

Orca-Agent-v0.1 is an orchestration agent that acts as the brain of the operation: it receives the user's task but never touches code directly. Instead, it:

  • Analyses the task and breaks it into focused subtasks
  • Dispatches explorer agents to understand the system
  • Delegates implementation work to coder agents with precise instructions
  • Verifies all changes through additional explorer agents
  • Maintains the context store with all discovered knowledge
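The loop above can be sketched in a few lines of Python. This is an illustrative skeleton only, not the actual Orca-Agent API: `ContextStore`, `orchestrate`, and the agent callables are hypothetical names standing in for the real explorer/coder subagent calls.

```python
# Hypothetical sketch of the orchestration loop described above.
# The names here are illustrative, not the actual Orca-Agent API.
from dataclasses import dataclass, field


@dataclass
class ContextStore:
    # Discovered knowledge, keyed by the subtask that produced it.
    facts: dict = field(default_factory=dict)

    def record(self, task_id: str, finding: str) -> None:
        self.facts[task_id] = finding


def orchestrate(task: str, explore, code, verify) -> ContextStore:
    """Break a task into subtasks; the orchestrator never touches code itself."""
    store = ContextStore()
    # 1. Dispatch an explorer agent to understand the system.
    store.record("task_001", explore(task))
    # 2. Delegate implementation to a coder agent with precise instructions,
    #    passing along what the explorer discovered.
    store.record("task_002", code(task, store.facts["task_001"]))
    # 3. Verify the change through an additional explorer agent.
    store.record("task_003", verify(store.facts["task_002"]))
    return store
```

In the real system each callable would be an LLM subagent rollout; the point is only that results flow through the context store rather than through the orchestrator editing files.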

Performance

| Orchestrator | Subagent | Terminal-Bench |
| --- | --- | --- |
| Qwen3-Coder-480B | Qwen3-Coder-480B | 19.7% |
| Orca-Agent-v0.1-14B | Qwen3-Coder-30B | 18.25% |
| Qwen3-14B | Qwen3-Coder-30B | 7.0% |

This places Orca-Agent-v0.1 (14B) orchestrating Qwen3-Coder-Flash (30B MoE) within striking distance of Qwen3-Coder-480B running the same architecture, which placed #26 on Terminal-Bench when it was recently published as part of my other project.

Example model output

```
<think>
Okay, now with task_003 identifying a backup, the next step would be to attempt recovery using this identified file.
</think>

<task_create>
agent_type: 'coder'
title: 'Attempt recovery using the identified backup file'
description: |
  Execute the following steps to recover using the backup identified in task_003:
  1. Move the identified backup file (/tmp/terraform_work/.terraform.tfstate.tmp) to the target location (/infrastructure/recovered_state.json).
  2. Verify the operation by checking:
     - The existence of the new file at the target location
     - The size matches the original backup file
     - The file permissions are set appropriately (rw-r--r--)
max_turns: 10
context_bootstrap:
  - path: '/tmp/terraform_work/.terraform.tfstate.tmp'
    reason: 'The backup file identified in task_003'
context_refs:
  - 'task_003'
</task_create>
```
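Downstream tooling has to pull these tool calls out of the raw model output. A minimal, hypothetical parser sketch using only the standard library (the actual orchestration code may parse this format differently):

```python
# Minimal sketch of extracting scalar fields from a <task_create> block
# like the example above. Illustrative only; field names are taken from
# the example, and the real orchestration code may differ.
import re


def parse_task_create(output: str) -> dict:
    """Pull simple key/value fields out of a <task_create>...</task_create> tag."""
    body = re.search(r"<task_create>(.*?)</task_create>", output, re.S)
    if body is None:
        return {}
    fields = {}
    # Scalar fields only; nested keys like context_bootstrap would need a
    # proper YAML parser.
    for key in ("agent_type", "title", "max_turns"):
        m = re.search(rf"^{key}:\s*'?([^'\n]+)'?\s*$", body.group(1), re.M)
        if m:
            fields[key] = m.group(1)
    return fields
```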

Model training overview

  • Full fine-tune of Qwen3-14B
  • 32x H100s
    • 16x for training
    • 8x inference for Orca-Agent
    • 8x inference for subagent (Qwen3-Coder-30B-A3B)
  • Trained with GRPO + curriculum learning
  • Batch size 256, 64 rollouts per task
  • More details here
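For intuition on the GRPO setup above: GRPO scores each rollout against the statistics of its own group (here, the 64 rollouts per task) rather than against a learned value function. A minimal sketch of that group-relative advantage computation, not the actual training code:

```python
# Minimal sketch of GRPO's group-relative advantage: each rollout's
# reward is normalized against the mean/std of its own task's rollout
# group (64 rollouts per task in this training run). Illustrative only.
from statistics import mean, pstdev


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one task's rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against a zero-variance group (all rollouts scored equally).
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Rollouts that beat their group mean get positive advantages and are reinforced; below-average rollouts get negative ones.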

Serving model

vLLM

```shell
vllm serve Danau5tin/Orca-Agent-v0.1
```

SGLang

```shell
python -m sglang.launch_server \
  --model-path Danau5tin/Orca-Agent-v0.1
```
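Both vLLM and SGLang expose an OpenAI-compatible chat-completions endpoint by default, so the served orchestrator can be queried with a plain HTTP request. A minimal sketch of building such a request; the port, prompt, and sampling settings below are illustrative, not recommendations from the model authors:

```python
# Sketch of a chat-completions payload for the served model. The prompt
# and temperature are illustrative; POST the JSON to the server's
# OpenAI-compatible endpoint (e.g. http://localhost:8000/v1/chat/completions
# for a default vLLM launch).
import json


def build_request(task: str) -> dict:
    """Build an OpenAI-style chat-completions payload for the orchestrator."""
    return {
        "model": "Danau5tin/Orca-Agent-v0.1",
        "messages": [{"role": "user", "content": task}],
        "temperature": 0.6,  # illustrative sampling setting
    }


payload = json.dumps(build_request("Recover the corrupted Terraform state"))
```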

The agent's orchestration code can be found here.

Model size: 15B params (Safetensors) · Tensor type: BF16

Model tree for Danau5tin/Orca-Agent-v0.1

Base model: willcb/Qwen3-14B
Quantizations: 6 models