Update README.md
README.md
CHANGED
@@ -15,9 +15,6 @@ library_name: mlx
 
 Hi Spock!
 We are going to analyze the cognitive abilities of a few quantizations of this model
-- The bf16 is full precision.
-- The q6 is straight quantization with the MLX default settings (group size 64)
-- The hi quants are done with group size 32 for higher fidelity
 
 The Deckard(qx) quants are in a mixed precision quantization:
 - qx64x has data at 4 bit, while the attention paths, head, and embeddings are at 6 bit
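The qx64x scheme described above can be sketched as a per-layer bit assignment: attention paths, the head, and the embeddings get 6 bits, everything else 4 bits. The predicate and the layer-name patterns below are assumptions for illustration, not the actual Deckard recipe or the MLX API:

```python
def qx64x_bits(layer_name: str) -> int:
    """Hypothetical sketch of a qx64x-style mixed-precision assignment:
    attention paths, the LM head, and embeddings at 6 bit, data at 4 bit.
    Layer-name substrings are assumed, not taken from the actual recipe."""
    high_fidelity = ("attn", "attention", "lm_head", "embed")
    if any(key in layer_name for key in high_fidelity):
        return 6
    return 4

# Assumed transformer layer names, for illustration only:
print(qx64x_bits("model.layers.0.self_attn.q_proj"))  # 6
print(qx64x_bits("model.layers.0.mlp.gate_proj"))     # 4
print(qx64x_bits("model.embed_tokens"))               # 6
```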