Papers
arxiv:2502.09583

Learning to Coordinate with Experts

Published on Feb 13
Authors:
,
,
,

Abstract

YRC-Bench is an open-source benchmark that evaluates expert-assisted reinforcement learning agents in diverse environments without expert interaction during training.

AI-generated summary

When deployed in the real world, AI agents will inevitably face challenges that exceed their individual capabilities. Leveraging assistance from experts, whether humans or highly capable AI systems, can significantly improve both safety and performance in such situations. Since expert assistance is costly, a central challenge is determining when to consult an expert. In this paper, we explore a novel variant of this problem, termed YRC-0, in which an agent must learn to collaborate with an expert in new environments in an unsupervised manner--that is, without interacting with the expert during training. This setting motivates the development of low-cost, robust approaches for training expert-leveraging agents. To support research in this area, we introduce YRC-Bench, an open-source benchmark that instantiates YRC-0 across diverse environments. YRC-Bench provides a standardized Gym-like API, simulated experts, an evaluation pipeline, and implementations of popular baselines. Toward tackling YRC-0, we propose a validation strategy and evaluate a range of learning methods, offering insights that can inform future research. Codebase: github.com/modanesh/YRC-Bench

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2502.09583 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2502.09583 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2502.09583 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.