TAO71-AI Quants: Qwen3
| Quant | Size | Description |
|---|---|---|
| Q2_K | 839.13 MB | Not recommended for most people. Very low quality. |
| Q2_K_L | 1.1 GB | Not recommended for most people. Uses Q8_0 for output and embedding, and Q2_K for everything else. Very low quality. |
| Q2_K_XL | 1.65 GB | Not recommended for most people. Uses F16 for output and embedding, and Q2_K for everything else. Very low quality. |
| Q3_K_S | 954.59 MB | Not recommended for most people. Prefer any bigger Q3_K quantization. Low quality. |
| Q3_K_M | 1023.52 MB | Not recommended for most people. Low quality. |
| Q3_K_L | 1.06 GB | Not recommended for most people. Low quality. |
| Q3_K_XL | 1.31 GB | Not recommended for most people. Uses Q8_0 for output and embedding, and Q3_K_L for everything else. Low quality. |
| Q3_K_XXL | 1.86 GB | Not recommended for most people. Uses F16 for output and embedding, and Q3_K_L for everything else. Low quality. |
| Q4_K_S | 1.15 GB | Recommended. Slightly low quality. |
| Q4_K_M | 1.19 GB | Recommended. Decent quality for most use cases. |
| Q4_K_L | 1.41 GB | Recommended. Uses Q8_0 for output and embedding, and Q4_K_M for everything else. Decent quality. |
| Q4_K_XL | 1.95 GB | Recommended. Uses F16 for output and embedding, and Q4_K_M for everything else. Decent quality. |
| Q5_K_S | 1.35 GB | Recommended. High quality. |
| Q5_K_M | 1.37 GB | Recommended. High quality. |
| Q5_K_L | 1.55 GB | Recommended. Uses Q8_0 for output and embedding, and Q5_K_M for everything else. High quality. |
| Q5_K_XL | 2.09 GB | Recommended. Uses F16 for output and embedding, and Q5_K_M for everything else. High quality. |
| Q6_K | 1.56 GB | Recommended. Very high quality. |
| Q6_K_L | 1.7 GB | Recommended. Uses Q8_0 for output and embedding, and Q6_K for everything else. Very high quality. |
| Q6_K_XL | 2.24 GB | Recommended. Uses F16 for output and embedding, and Q6_K for everything else. Very high quality. |
| Q8_0 | 2.02 GB | Recommended. Quality almost like F16. |
| Q8_K_XL | 2.56 GB | Recommended. Uses F16 for output and embedding, and Q8_0 for everything else. Quality almost like F16. |
| F16 | 3.79 GB | Not recommended. Overkill. Prefer Q8_0. |
| ORIGINAL (BF16) | 3.79 GB | Not recommended. Overkill. Prefer Q8_0. |
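
As a rough way to read the table, the sketch below picks a file for a given size budget. It only considers rows marked "Recommended", copies their sizes from the table, and treats table order as quality order (later row = better), which is an assumption of this sketch rather than something the table states. The function name `pick_quant` is hypothetical.

```python
from typing import Optional

# (quant name, file size in GB) for the rows marked "Recommended",
# kept in the same order as the table. Table order is assumed to
# track quality: a later row is treated as the better choice.
RECOMMENDED = [
    ("Q4_K_S", 1.15),
    ("Q4_K_M", 1.19),
    ("Q4_K_L", 1.41),
    ("Q4_K_XL", 1.95),
    ("Q5_K_S", 1.35),
    ("Q5_K_M", 1.37),
    ("Q5_K_L", 1.55),
    ("Q5_K_XL", 2.09),
    ("Q6_K", 1.56),
    ("Q6_K_L", 1.7),
    ("Q6_K_XL", 2.24),
    ("Q8_0", 2.02),
    ("Q8_K_XL", 2.56),
]


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the last recommended quant in table order whose file fits the budget."""
    best = None
    for name, size_gb in RECOMMENDED:
        if size_gb <= budget_gb:
            best = name  # later rows overwrite earlier ones
    return best


if __name__ == "__main__":
    for budget in (1.0, 1.5, 2.0, 3.0):
        print(budget, "->", pick_quant(budget))
```

The budget here is the file size only. Memory use at runtime is higher, since the context cache and runtime buffers come on top of the weights.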
Quantized using TAO71-AI AutoQuantizer. You can check out the original model card here.
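
The `_L`, `_XL` and `_XXL` variants in the table keep the output and embedding tensors at a higher precision than the rest of the model. How AutoQuantizer does this internally is not documented here. The sketch below assumes llama.cpp's `llama-quantize` tool and its `--output-tensor-type` and `--token-embedding-type` options as one way to get a comparable layout. The helper name `quantize_command` and the file names are hypothetical, and the command is only built, not executed.

```python
from typing import List, Optional


def quantize_command(
    src: str,
    dst: str,
    base_type: str,
    io_type: Optional[str] = None,
) -> List[str]:
    """Build an argv list for llama.cpp's llama-quantize.

    base_type: quantization for most tensors, e.g. "Q4_K_M".
    io_type:   optional type for the output and token embedding
               tensors, e.g. "q8_0" or "f16". None keeps the defaults.
    """
    cmd = ["llama-quantize"]
    if io_type is not None:
        cmd += ["--output-tensor-type", io_type]
        cmd += ["--token-embedding-type", io_type]
    cmd += [src, dst, base_type]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names: a layout comparable to the Q4_K_L row
    # (Q8_0 for output and embedding, Q4_K_M for everything else).
    print(" ".join(quantize_command(
        "model-f16.gguf", "model-Q4_K_L.gguf", "Q4_K_M", io_type="q8_0"
    )))
```

Passing `io_type="f16"` instead would correspond to the `_XL` style rows of the Q4 to Q6 families, under the same assumptions.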