Requesting README.md update on how to run the model on vLLM with tool-calling support
#11 · opened by douglasrfaisal-gl
Requesting an update to the README.md so that users can run the model on vLLM with tool-calling.
I had the following problems when using the model on vLLM with tool-calling as instructed in the guide:
- Currently there is no guide on which flags to use (perhaps you could add a link to https://qwen.readthedocs.io/en/latest/deployment/vllm.html#parsing-tool-calls).
- After following the instructions at that link, vLLM still failed to produce a successful tool call, raising the error below (the command I ran is shown after this list):
```
ValueError: As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.
```
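For reference, the command that produced the error above was roughly the following; the `--enable-auto-tool-choice` and `--tool-call-parser hermes` flags are the ones the linked Qwen guide suggests, and the rest reflects my local setup:

```bash
# Reconstruction of what I ran, with tool-calling flags
# taken from the Qwen vLLM deployment guide linked above.
vllm serve Qwen/Qwen3-Omni-30B-A3B-Instruct \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```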
The following workaround worked for me:
- Obtain the chat template from the repository and extract the Jinja content:

```bash
git clone https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct
python3 -c "import json; print(json.load(open('./Qwen3-Omni-30B-A3B-Instruct/chat_template.json')).get('chat_template',''))" > chat_template.jinja
```
- Run `vllm serve` with the chat template flag: `--chat-template ./chat_template.jinja`.
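To check that tool calls are actually being parsed after the workaround, a smoke test along these lines should return a populated `tool_calls` field in the response; note that the `get_weather` tool and the `localhost:8000` address are placeholders from my setup, not anything defined by the model or the guide:

```bash
# Minimal tool-calling smoke test against the local vLLM server.
# Assumes the OpenAI-compatible API at http://localhost:8000/v1;
# get_weather is a made-up placeholder tool.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-Omni-30B-A3B-Instruct",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```

With the chat template loaded and the tool-call parser enabled, the returned message should contain a `tool_calls` entry instead of the error above.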