quants based on scaled fp8 model?

#14
by MauriceSazerac - opened

Any chance we could get quants based on the scaled fp8 model?

https://github.com/ModelTC/Qwen-Image-Lightning/?tab=readme-ov-file#-using-lightning-loras-with-fp8-models

Apparently this eliminates the grid artifacts that can show up when using certain LoRAs. If what I've experienced lines up with what's described in that link, then it seems the GGUFs we have here are all based on the directly downcast bf16 Qwen model?
"qwen_image_fp8_e4m3fn.safetensors model was produced by directly downcasting the original bf16 weights, rather than employing a calibrated conversion process with appropriate scaling."

scaled fp8 model: https://huggingface.co/lightx2v/Qwen-Image-Lightning/blob/main/Qwen-Image/qwen_image_fp8_e4m3fn_scaled.safetensors
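For anyone unsure what "scaled" means in this context, here's a minimal sketch of the difference as I understand it (this is not the actual conversion script lightx2v used, and a real calibrated conversion may involve more than simple per-tensor scaling):

```python
# Rough illustration: direct fp8 downcast vs. per-tensor scaled fp8 e4m3fn conversion.
import torch

FP8_MAX = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for e4m3fn

def naive_downcast(w: torch.Tensor) -> torch.Tensor:
    # Direct cast: values outside the fp8 range saturate,
    # and small values lose precision.
    return w.to(torch.float8_e4m3fn)

def scaled_fp8(w: torch.Tensor):
    # Per-tensor scaling: map the tensor's max magnitude onto the fp8 range
    # and keep the scale so it can be multiplied back at inference time.
    scale = w.abs().max().float() / FP8_MAX
    w_fp8 = (w.float() / scale).to(torch.float8_e4m3fn)
    return w_fp8, scale  # dequantize later as w_fp8.float() * scale

if __name__ == "__main__":
    w = torch.randn(4096, 4096, dtype=torch.bfloat16) * 3.0
    w_naive = naive_downcast(w)
    w_q, s = scaled_fp8(w)
    err_naive = (w.float() - w_naive.float()).abs().mean()
    err_scaled = (w.float() - w_q.float() * s).abs().mean()
    print(f"mean abs error  naive: {err_naive:.5f}  scaled: {err_scaled:.5f}")
```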
