ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
Description
Orchestrator-8B is a state-of-the-art 8B parameter orchestration model designed to solve complex, multi-turn agentic tasks by coordinating a diverse set of expert models and tools.
On the Humanity's Last Exam (HLE) benchmark, Orchestrator-8B achieves a score of 37.1%, outperforming GPT-5 (35.1%) while being approximately 2.5x more efficient.
This model is for research and development only.
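At a high level, the orchestrator emits tool calls, a dispatcher routes each call to a basic tool or an expert model, and results are fed back into the context. The sketch below is purely illustrative of that loop; the names (`ToolCall`, `dispatch`, the example registry) are assumptions, not the model's actual interface.

```python
# Hypothetical sketch of the orchestration pattern: the orchestrator model
# produces tool calls, and a dispatcher routes them to registered tools.
# All names here are illustrative, not the real Orchestrator-8B API.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolCall:
    name: str
    arguments: dict = field(default_factory=dict)

def dispatch(call: ToolCall, registry: dict[str, Callable[..., str]]) -> str:
    """Route a tool call to the registered tool and return its text output."""
    if call.name not in registry:
        return f"error: unknown tool '{call.name}'"
    return registry[call.name](**call.arguments)

# Example registry mixing a basic tool with an "expert model" stub.
registry = {
    "search": lambda query: f"results for {query!r}",          # stub search tool
    "math_expert": lambda expression: str(eval(expression)),   # stub expert LLM
}

print(dispatch(ToolCall("search", {"query": "HLE benchmark"}), registry))
print(dispatch(ToolCall("math_expert", {"expression": "2 + 3"}), registry))
```

In practice the registry would contain real search/code-execution tools and API clients for specialist and generalist LLMs, with the orchestrator deciding which to invoke at each turn.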
Key Features
- Intelligent Orchestration: Manages heterogeneous toolsets, including basic tools (search, code execution) and other LLMs (both specialized and generalist).
- Multi-Objective RL Training: Trained via Group Relative Policy Optimization (GRPO) with a novel reward function that optimizes for accuracy, latency/cost, and adherence to user preferences.
- Efficiency: Delivers higher accuracy at significantly lower computational cost compared to monolithic frontier models.
- Robust Generalization: Demonstrated ability to generalize to unseen tools and pricing configurations.
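The multi-objective reward described above can be pictured as an accuracy term minus cost and latency penalties plus a preference-adherence bonus. The function below is a minimal sketch under that assumption; the weights and exact functional form are illustrative, not the paper's formulation.

```python
# Illustrative multi-objective reward in the spirit of the GRPO training
# described above. Weights and functional form are assumptions, not the
# paper's exact reward.

def orchestration_reward(
    correct: bool,
    cost_usd: float,
    latency_s: float,
    preference_score: float,  # in [0, 1], adherence to user preferences
    w_cost: float = 0.1,
    w_latency: float = 0.01,
    w_pref: float = 0.2,
) -> float:
    accuracy_term = 1.0 if correct else 0.0
    # Penalize monetary cost and latency; reward preference adherence.
    return (accuracy_term
            - w_cost * cost_usd
            - w_latency * latency_s
            + w_pref * preference_score)

# Two correct trajectories: the cheaper, faster one earns a higher reward,
# which is what pushes the policy toward efficient tool choices.
cheap = orchestration_reward(True, cost_usd=0.5, latency_s=10, preference_score=1.0)
pricey = orchestration_reward(True, cost_usd=4.0, latency_s=60, preference_score=1.0)
```

Under a scheme like this, group-relative advantages (as in GRPO) would then compare such rewards across sampled trajectories for the same task.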
Benchmark
On Humanity's Last Exam, Orchestrator-8B achieves 37.1%, surpassing GPT-5 (35.1%) at roughly 30% of the monetary cost and with 2.5x lower latency. On FRAMES and τ²-Bench, Orchestrator-8B consistently outperforms strong monolithic systems, demonstrating versatile reasoning and robust tool orchestration.
Orchestrator-8B consistently outperforms GPT-5, Claude Opus 4.1, and Qwen3-235B-A22B on HLE at substantially lower cost.
Model Details
- Developed by: NVIDIA & University of Hong Kong
- Model Type: Decoder-only Transformer
- Base Model: Qwen3-8B
- Parameters: 8B
- Language(s): English
- License: NVIDIA License
Model Version(s):
1.0
Training Dataset:
Link:
Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.
License/Terms of Use
Citation
If you find this model useful, please cite the accompanying paper:
@article{toolorchestra,
title={ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration},
author={Su, Hongjin and Diao, Shizhe and Lu, Ximing and Liu, Mingjie and Xu, Jiacheng and Dong, Xin and Fu, Yonggan and Belcak, Peter and Ye, Hanrong and Yin, Hongxu and Dong, Yi and Bakhturina, Evelina and Yu, Tao and Choi, Yejin and Kautz, Jan and Molchanov, Pavlo}
journal={arXiv preprint arXiv:XXXX},
year={2025}
}