Parsing Non‑Standard Function Calls from GigaChat3‑10B‑A1.8B Model Responses

#6
by gdagil - opened

I am trying to use the OpenAI‑compatible API (the v1/chat/completions endpoint) with the model ai-sage/GigaChat3-10B-A1.8B hosted on Hugging Face. In the request I include a tools section (function‑calling) according to the OpenAI specification:

{
  "model": "ai-sage/GigaChat3-10B-A1.8B",
  "messages": [
    {"role": "user", "content": "Save to my personal memory: prefers short answers"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "manage_user_memory",
        "description": "...",
        "parameters": {
          "type": "object",
          "properties": {
            "content": {"anyOf":[{"type":"string"},{"type":"null"}],"default":null},
            "action": {"type":"string","enum":["create","update","delete"],"default":"create"},
            "id": {"anyOf":[{"type":"string","format":"uuid"},{"type":"null"}],"default":null}
          }
        }
      }
    }
  ]
}

The server’s response looks like this:

{
  "id": "...",
  "model": "ai-sage/GigaChat3-10B-A1.8B",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "<|message_sep|>\n\nfunction call<|role_sep|>\n{\"name\": \"manage_user_memory\", \"arguments\": {\"action\": \"create\", \"content\": \"Prefers short answers\"}}",
        "function_call": null,
        "tool_calls": []
      },
      "finish_reason": "stop"
    }
  ]
}

The model does produce a function call, but it does not populate the official function_call or tool_calls fields defined by OpenAI. Instead, it embeds a serialized function call as a plain string inside the content field. Because of this, client libraries such as openai, LangChain, or vLLM/sglang cannot automatically detect and handle the tool call; it has to be parsed manually.

Could anyone share a reliable code snippet (Python, JavaScript, or another language) that extracts the function name and arguments from the content field returned by the model ai-sage/GigaChat3-10B-A1.8B? The response embeds a serialized function call inside a plain string, e.g.:

{
  "role": "assistant",
  "content": "<|message_sep|>\n\nfunction call<|role_sep|>\n{\"name\": \"manage_user_memory\", \"arguments\": {\"action\": \"create\", \"content\": \"Prefers short answers\"}}"
}
ai-sage org

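Hi! Please use this function to extract the function name and arguments. Make sure there is no EOS token at the end of the string: the regex captures everything up to the end, so a trailing token would make the JSON invalid.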

import json
import re

REGEX_FUNCTION_CALL_V3 = re.compile(r"function call<\|role_sep\|>\n(.*)$", re.DOTALL)
REGEX_CONTENT_PATTERN = re.compile(r"^(.*?)<\|message_sep\|>", re.DOTALL)


def parse_function_and_content(completion_str: str):
    """
    Extract the function call and the text content from a raw completion.
    Returns (function_call_dict_or_None, content_str).
    """

    function_call = None
    content = None

    m_func = REGEX_FUNCTION_CALL_V3.search(completion_str)
    if m_func:
        try:
            function_call = json.loads(m_func.group(1))
            if isinstance(function_call, dict) and "name" in function_call and "arguments" in function_call:
                if not isinstance(function_call["arguments"], dict):
                    function_call = None
            else:
                function_call = None
        except json.JSONDecodeError:
            # The payload is not valid JSON: report no function call and
            # return the raw completion string as the content.
            return None, completion_str

    m_content = REGEX_CONTENT_PATTERN.search(completion_str)
    if m_content:
        content = m_content.group(1)
    else:
        # as a fallback, everything before the first message_sep marker if present
        if "<|message_sep|>" in completion_str:
            content = completion_str.split("<|message_sep|>")[0]
        else:
            content = completion_str

    return function_call, content
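
For anyone who wants to pass the result on to OpenAI-style client code, here is a minimal sketch that strips a trailing EOS token and wraps the parsed call in a tool_calls entry. It is not an official ai-sage utility. The names strip_eos and to_openai_message are hypothetical, and the default "</s>" EOS string is only a placeholder assumption; check the model's tokenizer config for the real token. The call id is generated locally, since the model does not return one.

```python
import json
import re
import uuid

# Same marker as in the parser above; everything after it is the JSON payload.
FUNCTION_CALL_RE = re.compile(r"function call<\|role_sep\|>\n(.*)$", re.DOTALL)


def strip_eos(text: str, eos_token: str) -> str:
    """Remove one trailing EOS token (and surrounding whitespace), if present."""
    text = text.rstrip()
    if eos_token and text.endswith(eos_token):
        text = text[: -len(eos_token)].rstrip()
    return text


def to_openai_message(raw_content: str, eos_token: str = "</s>") -> dict:
    """Build an OpenAI-style assistant message from the raw content string.

    ASSUMPTION: "</s>" is a placeholder EOS token, not confirmed for this model.
    """
    cleaned = strip_eos(raw_content, eos_token)
    match = FUNCTION_CALL_RE.search(cleaned)
    call = None
    if match:
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            call = None
    if (
        isinstance(call, dict)
        and isinstance(call.get("name"), str)
        and isinstance(call.get("arguments"), dict)
    ):
        return {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": f"call_{uuid.uuid4().hex}",  # generated locally
                    "type": "function",
                    "function": {
                        "name": call["name"],
                        # OpenAI carries arguments as a JSON-encoded string.
                        "arguments": json.dumps(call["arguments"], ensure_ascii=False),
                    },
                }
            ],
        }
    # No usable function call: pass the cleaned text through unchanged.
    return {"role": "assistant", "content": cleaned}
```

Calling to_openai_message(response["choices"][0]["message"]["content"]) on the response from the question should give a message whose tool_calls[0]["function"]["name"] is "manage_user_memory". Text without a valid function call is returned as ordinary content, with the separator markers left in place.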
