Output truncated without reason
system info
transformers version: 4.51.3
PyTorch version: 2.6.0
vllm version: 0.8.0
GPU: 4090*1

Reproduction
I tried running this model, but the output is always cut off partway through and the finish_reason is reported as length, even though there seems to be no problem with the configuration.
config
python3 -m vllm.entrypoints.openai.api_server --max_model_len 4096 --served-model-name seed-x --model /data/models/Seed-X-PPO-7B
I noticed the API response returned a finish_reason of length. Could you please share the code snippet for the request? I'd like to check the configuration, especially parameters like max_tokens.
Thanks.
You can include max_tokens in your request, such as:
{
    "model": "xx",
    "prompt": "",
    "max_tokens": 512
}
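
For reference, a minimal sketch of a complete request against the server launched above. It assumes vLLM's default host and port (localhost:8000) and the served model name seed-x from the launch command; the prompt text is just a placeholder.

import requests

# Sketch of a completion request to the vLLM OpenAI-compatible server.
# Assumes the default endpoint http://localhost:8000/v1/completions;
# "seed-x" matches --served-model-name in the launch command above.
response = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "seed-x",
        "prompt": "Hello",  # placeholder prompt
        "max_tokens": 512,  # raise this if output is still truncated
    },
)
result = response.json()
print(result["choices"][0]["text"])
# finish_reason should be "stop" once max_tokens is large enough;
# "length" means the max_tokens limit was hit before generation finished
print(result["choices"][0]["finish_reason"])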
