Endoftext spam at the end of every request

by rageltman - opened Sep 28

Sep 28

Running with candle-vllm or vllm.rs at q8_0 we're seeing this effect at the end of every response:

... or include additional safety checks?<|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|>....

On the code-gen side, the model does seem to do a reasonable job of producing logical output, but unfortunately it can't do anything turn-based when it never completes its response (it just fills the rest of the output window with the tokens shown above).
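
As a stopgap on the client side, something like the sketch below could trim the repeated `<|endoftext|>` markers from the returned text. It only hides the symptom: the server still generates until the output window is full. It assumes the marker arrives as the literal string `<|endoftext|>`, and the helper names (`truncate_at_eos`, `truncate_stream`) are made up for illustration, not part of candle-vllm or vllm.rs.

```python
EOS_MARKER = "<|endoftext|>"  # assumed to appear literally in the returned text


def truncate_at_eos(text: str, marker: str = EOS_MARKER) -> str:
    """Return the text up to (not including) the first marker, if any."""
    idx = text.find(marker)
    return text if idx == -1 else text[:idx]


def truncate_stream(chunks, marker: str = EOS_MARKER):
    """Yield streamed text until the marker appears.

    Handles a marker that is split across two chunks by holding back
    the last len(marker) - 1 characters until more text arrives.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        idx = buffer.find(marker)
        if idx != -1:
            if idx:
                yield buffer[:idx]
            return  # stop consuming the stream at the first marker
        # Everything except the held-back tail is safe to emit.
        safe = len(buffer) - (len(marker) - 1)
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    if buffer:
        yield buffer  # stream ended without a marker


if __name__ == "__main__":
    raw = "or include additional safety checks?" + EOS_MARKER * 5
    print(truncate_at_eos(raw))
```

With a streaming client, the caller can close the connection as soon as `truncate_stream` returns, which at least avoids waiting for the rest of the window.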

shunxing1234

Kwaipilot org Sep 29

Thank you for your interest!
We’ll be releasing an official quantized version soon — stay tuned!
