Update README.md
README.md
CHANGED
@@ -15,9 +15,6 @@ library_name: mlx
 
 Hi Spock!
 We are going to analyze the cognitive abilities of a few quantizations of this model
-- The bf16 is full precision.
-- The q6 is straight quantization with the MLX default settings (group size 64)
-- The hi quants are done with group size 32 for higher fidelity
 
 The Deckard(qx) quants are in a mixed precision quantization:
 - qx64x has data at 4 bit, while the attention paths, head, and embeddings are at 6 bit
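The qx64x scheme described above can be sketched as a per-layer bit assignment: attention paths, the head, and the embeddings get 6 bits, everything else 4 bits. The predicate and the layer-name patterns below are assumptions for illustration, not the actual Deckard recipe or the MLX API:

```python
def qx64x_bits(layer_name: str) -> int:
    """Hypothetical sketch of a qx64x-style mixed-precision assignment:
    attention paths, the LM head, and embeddings at 6 bit, data at 4 bit.
    Layer-name substrings are assumed, not taken from the actual recipe."""
    high_fidelity = ("attn", "attention", "lm_head", "embed")
    if any(key in layer_name for key in high_fidelity):
        return 6
    return 4

# Assumed transformer layer names, for illustration only:
print(qx64x_bits("model.layers.0.self_attn.q_proj"))  # 6
print(qx64x_bits("model.layers.0.mlp.gate_proj"))     # 4
print(qx64x_bits("model.embed_tokens"))               # 6
```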