Parallel Tool Calling not working in latest commit

#151
by mirzads - opened

The latest commit only returns a tool call for San Francisco, even though the reasoning content shows the model deciding to call the function twice.

The commit from August returned tool calls for both San Francisco and Japan.

Request:

"messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the weather in San Francisco and Japan?"
        }
      ]
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "weather",
        "description": "Get the weather in a location",
        "parameters": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "location": {
              "description": "The location to get the weather for",
              "type": "string"
            }
          },
          "required": [
            "location"
          ],
          "additionalProperties": false
        }
      }
    }
  ],

Response:

    "object": "chat.completion",
    "model": "openai/gpt-oss-120b",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "reasoning_content": "User asks for weather in two locations. We need to call function weather twice, for each location.\n\nWe should maybe call function with array? The function expects a single location string. We'll call twice.\n\nWe'll call for San Francisco, then for Japan.",
                "tool_calls": [
                    {
                        "id": "chatcmpl-tool-edfa12324154345fcfa08123",
                        "type": "function",
                        "function": {
                            "name": "weather",
                            "arguments": "{\"location\": \"San Francisco\"}"
                        }
                    }
                ]
            },
            "finish_reason": "tool_calls"
        }
    ],
    "created": 1761679671,
    "created_at": "2025-10-28T19:27:52.971Z",
    "usage": {
        "completion_tokens": 78,
        "prompt_tokens": 137,
        "total_tokens": 215
    },
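
For reference, here is a minimal way to reproduce the request above against an OpenAI-compatible endpoint serving gpt-oss-120b. This is only a sketch; the base_url and api_key values are placeholders for your own deployment:

    # Reproduce the request above with the OpenAI Python client.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder endpoint

    tools = [{
        "type": "function",
        "function": {
            "name": "weather",
            "description": "Get the weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The location to get the weather for",
                    }
                },
                "required": ["location"],
                "additionalProperties": False,
            },
        },
    }]

    response = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[{"role": "user", "content": "What is the weather in San Francisco and Japan?"}],
        tools=tools,
    )

    # The August commit produced two tool calls here; the latest commit only one.
    print(len(response.choices[0].message.tool_calls or []))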

I've identified the commit that caused the break in parallel tool calling for gpt-oss-120b: 8b193b0 (https://huggingface.co/openai/gpt-oss-120b/commit/8b193b0).

I can confirm that checking out the earlier commit (988e56b) restores proper parallel tool calling, while the latest version only returns a single tool call even when multiple are needed.
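
Until this is fixed upstream, one workaround is to pin your deployment to that earlier revision. A sketch using huggingface_hub, with the revision string taken from the short commit hash above:

    # Download a snapshot pinned to the pre-regression commit (988e56b) and
    # point your serving stack at the resulting local path.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(
        repo_id="openai/gpt-oss-120b",
        revision="988e56b",
    )
    print(local_path)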

@dkundel-openai Is it possible to undo the commit mentioned by @hcnguyen (thank you)? https://huggingface.co/openai/gpt-oss-120b/commit/8b193b0. Would reverting it break other functionality?

I'm seeing the same problem.

Since 8b193b0 removes <|call|> as a stop token, how do we know when to stop? What is the intended stop token after parallel tool calling? <|return|>?
The documentation clearly states that <|call|> IS a stop token: https://cookbook.openai.com/articles/openai-harmony#prompt-format
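
For what it's worth, the openai_harmony package from that cookbook article exposes the recommended stop tokens programmatically. A small sketch, assuming the Python bindings described there:

    # Ask the Harmony encoding which token ids to stop on when sampling
    # assistant actions; per the cookbook this should cover <|return|> and <|call|>.
    from openai_harmony import HarmonyEncodingName, load_harmony_encoding

    encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
    print(encoding.stop_tokens_for_assistant_actions())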

Yes, this is really confusing. Parallel tool calling should produce multiple <|call|> tokens, but stopping at the first <|call|> cuts off the later tool calls.

<|channel|>analysis<|message|>We need to call function twice? Actually user wants resistance for both copper and aluminum. So call calculate_resistance twice.<|end|><|start|>assistant<|channel|>commentary to=functions.calculate_resistance <|constrain|>json<|message|>{"length":5,"area":0.01,"resistivity":"copper"}<|call|>commentary<|message|>{"length":5,"area":0.01,"resistivity":"aluminum"}<|call|>commentary<|message|>The resistance of a wire is given by...

The second function does get called if we don't stop after <|call|>; however, there doesn't seem to be a token that ends a parallel tool call sequence. Any help is appreciated.
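
In case it helps with debugging, here is a rough sketch of splitting an unstopped raw completion like the trace above into individual tool calls. The segment and regex handling are my own assumptions based on that trace, not an official parser:

    import json
    import re

    def extract_tool_calls(raw: str):
        """Split a raw completion on <|call|> and pull out each arguments payload."""
        calls = []
        # Everything after the final <|call|> is not a completed call; drop it.
        segments = raw.split("<|call|>")[:-1]
        last_name = None
        for segment in segments:
            # Take the recipient from a "to=functions.NAME" header when present;
            # follow-up calls in the trace above omit it, so reuse the last one seen.
            match = re.search(r"to=functions\.([\w.-]+)", segment)
            if match:
                last_name = match.group(1)
            # The arguments are whatever follows the last <|message|> in the segment.
            payload = segment.rsplit("<|message|>", 1)[-1]
            try:
                calls.append({"name": last_name, "arguments": json.loads(payload)})
            except json.JSONDecodeError:
                continue  # skip segments whose payload is not valid JSON
        return calls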
