quants based on scaled fp8 model?

#14
by MauriceSazerac - opened

Any chance we could get quants based on the scaled fp8 model?

https://github.com/ModelTC/Qwen-Image-Lightning/?tab=readme-ov-file#-using-lightning-loras-with-fp8-models

Apparently this eliminates the grid artifacts that can show up when using certain LoRAs. If what I've experienced lines up with what's described in that link, then it seems the GGUFs we have here are all based on the directly downcast bf16 Qwen model?
"qwen_image_fp8_e4m3fn.safetensors model was produced by directly downcasting the original bf16 weights, rather than employing a calibrated conversion process with appropriate scaling."

scaled fp8 model: https://huggingface.co/lightx2v/Qwen-Image-Lightning/blob/main/Qwen-Image/qwen_image_fp8_e4m3fn_scaled.safetensors
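For anyone unsure what "scaled" means in this context, here's a minimal sketch of the difference as I understand it (this is not the actual conversion script lightx2v used, and a real calibrated conversion may involve more than simple per-tensor scaling):

```python
# Rough illustration: direct fp8 downcast vs. per-tensor scaled fp8 e4m3fn conversion.
import torch

FP8_MAX = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for e4m3fn

def naive_downcast(w: torch.Tensor) -> torch.Tensor:
    # Direct cast: values outside the fp8 range saturate,
    # and small values lose precision.
    return w.to(torch.float8_e4m3fn)

def scaled_fp8(w: torch.Tensor):
    # Per-tensor scaling: map the tensor's max magnitude onto the fp8 range
    # and keep the scale so it can be multiplied back at inference time.
    scale = w.abs().max().float() / FP8_MAX
    w_fp8 = (w.float() / scale).to(torch.float8_e4m3fn)
    return w_fp8, scale  # dequantize later as w_fp8.float() * scale

if __name__ == "__main__":
    w = torch.randn(4096, 4096, dtype=torch.bfloat16) * 3.0
    w_naive = naive_downcast(w)
    w_q, s = scaled_fp8(w)
    err_naive = (w.float() - w_naive.float()).abs().mean()
    err_scaled = (w.float() - w_q.float() * s).abs().mean()
    print(f"mean abs error  naive: {err_naive:.5f}  scaled: {err_scaled:.5f}")
```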
