Endoftext spam at the end of every request
#2 by rageltman - opened
Running with candle-vllm or vllm.rs at q8_0, we're seeing this at the end of every response:
... or include additional safety checks?<|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|>....
On the code-gen side, the model does seem to do a reasonable job of producing logical output, but unfortunately it can't do anything turn-based when it never completes its response (it just fills the rest of the output window with the repeated <|endoftext|> tokens shown above).
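As a stopgap on the client side, the trailing tokens can at least be stripped before the text is passed to the next turn. The snippet below is a minimal sketch, not a fix: `trim_at_eos` is a hypothetical helper name, and it assumes the marker arrives as the literal string `<|endoftext|>` in the decoded response, as in the sample above. It does not stop the server from generating up to the output limit, so the wasted tokens are still produced.

```python
# Hypothetical client-side workaround: cut the decoded response at the first
# end-of-text marker. Assumes the marker appears as literal text in the output.
EOS_MARKER = "<|endoftext|>"


def trim_at_eos(text: str, marker: str = EOS_MARKER) -> str:
    """Return everything before the first occurrence of `marker`.

    If the marker is absent, the text is returned unchanged.
    """
    head, _sep, _tail = text.partition(marker)
    return head


if __name__ == "__main__":
    raw = "... or include additional safety checks?" + EOS_MARKER * 5
    print(trim_at_eos(raw))
```

If the serving stack accepts a stop-sequence setting, passing the marker there would be preferable because it ends generation early, but whether candle-vllm or vllm.rs honors that for this model is not confirmed in this thread.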
Thank you for your interest!
We’ll be releasing an official quantized version soon — stay tuned!