Realtime in CPU

#1
by bobig - opened

orpheus-finetuned-3b@q4_k_m

Sounds pretty good. This update from Mr S tolerates Q4_K_M quantization better than other finetunes.
Runs at about 70 TPS (tokens per second) on a Mac M4 using 10 performance cores.

Fast enough for real time, with no load on the GPU.
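
For anyone who wants to check the "real time" part on their own machine, here is a rough sketch of the arithmetic. Generation keeps up with playback when the model emits audio tokens at least as fast as the codec consumes them per second of audio. The 70 TPS figure is from the post above; the function name `real_time_factor` and the codec rates in the loop are placeholders I made up for illustration, not measured values for this model, so plug in the real rate for your setup.

```python
def real_time_factor(generated_tps: float, tokens_per_audio_second: float) -> float:
    """Return seconds of audio produced per second of wall-clock time.

    A value >= 1.0 means generation keeps up with playback.
    """
    if tokens_per_audio_second <= 0:
        raise ValueError("tokens_per_audio_second must be positive")
    return generated_tps / tokens_per_audio_second


if __name__ == "__main__":
    measured_tps = 70.0  # throughput reported in the post
    # Hypothetical codec rates (tokens needed per second of audio),
    # chosen only to show how the result changes.
    for codec_rate in (50.0, 70.0, 100.0):
        rtf = real_time_factor(measured_tps, codec_rate)
        status = "keeps up" if rtf >= 1.0 else "falls behind"
        print(f"codec rate {codec_rate:.0f} tok/s -> RTF {rtf:.2f} ({status})")
```

If the factor comes out just under 1.0, buffering a second or two of audio before starting playback can still hide the gap for short utterances.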
