Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx
This is a direct benchmark comparison between:
- Qwen3-30B-A3B-YOYO-V4-qx64-hi
- Qwen3-30B-A3B-YOYO-V4-qx64x-hi

These variants differ only in embedding bit depth: qx64x-hi uses 6-bit embeddings, while qx64-hi keeps them at 4-bit. Both share the following layout (a hedged conversion sketch follows the list):
- Weights: 4-bit
- Attention paths and heads: 6-bit
- Group size: 32 (the hi suffix)
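For readers who want to reproduce this kind of mixed-precision layout, recent mlx-lm releases expose a `quant_predicate` hook on `convert` that assigns per-layer bit widths. The sketch below is illustrative only: the exact qx64x-hi recipe is not published here, and the layer-name matching is a guess.

```python
from mlx_lm import convert

# Illustrative mixed-precision predicate: 6-bit for embeddings and
# attention projections, 4-bit everywhere else, all at group size 32.
# NOTE: the real qx64x-hi recipe may differ; the substring matching
# on layer paths here is an assumption, not the published recipe.
def qx64x_hi(path, module, config):
    if "embed" in path or "attn" in path:
        return {"bits": 6, "group_size": 32}
    return {"bits": 4, "group_size": 32}

convert(
    "YOYO-AI/Qwen3-30B-A3B-YOYO-V4",
    mlx_path="Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx",
    quantize=True,
    quant_predicate=qx64x_hi,
)
```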
Benchmark Comparison
| Benchmark         | qx64-hi | qx64x-hi | Delta  |
|-------------------|---------|----------|--------|
| arc_challenge     | 0.494   | 0.494    | 0.000  |
| arc_easy          | 0.638   | 0.638    | 0.000  |
| boolq             | 0.886   | 0.884    | -0.002 |
| hellaswag         | 0.640   | 0.640    | 0.000  |
| openbookqa        | 0.432   | 0.426    | -0.006 |
| piqa              | 0.765   | 0.770    | +0.005 |
| winogrande        | 0.622   | 0.607    | -0.015 |
| average (7 tasks) | 0.640   | 0.637    | -0.003 |

The average row is the unweighted mean of the seven listed tasks; the short script below recomputes it.
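The deltas and the 7-task average can be verified directly from the per-task scores (a minimal check, with the scores hard-coded from the table above):

```python
# Per-task scores copied from the benchmark table: (qx64-hi, qx64x-hi).
scores = {
    "arc_challenge": (0.494, 0.494),
    "arc_easy":      (0.638, 0.638),
    "boolq":         (0.886, 0.884),
    "hellaswag":     (0.640, 0.640),
    "openbookqa":    (0.432, 0.426),
    "piqa":          (0.765, 0.770),
    "winogrande":    (0.622, 0.607),
}

# Per-task deltas (qx64x-hi minus qx64-hi).
for task, (base, x) in scores.items():
    print(f"{task:14s} delta = {x - base:+.3f}")

# Unweighted mean over the seven tasks.
avg_base = sum(b for b, _ in scores.values()) / len(scores)
avg_x = sum(x for _, x in scores.values()) / len(scores)
print(f"average: {avg_base:.3f} vs {avg_x:.3f} (delta {avg_x - avg_base:+.3f})")
```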
Cognitive Impact Analysis

PIQA (+0.005)
- qx64x-hi leads by 0.5 percentage points, a semantic-granularity win in physical commonsense reasoning.

Winogrande (-0.015)
- qx64-hi is noticeably better, which is unexpected at first glance.

Interpretation:
- Winogrande requires pronoun disambiguation and subtle syntactic parsing.
- The qx64-hi variant may have a slight edge because its more compact 4-bit embeddings suit this kind of task-specific syntactic parsing.

BoolQ (-0.002)
- qx64-hi is marginally better; boolean QA may favor lower-bit embeddings for pattern matching.

OpenBookQA (-0.006)
- qx64-hi is slightly better; knowledge retrieval may benefit from a more compressed semantic space.
Why the x suffix matters (and doesn't)

The x suffix (qx64x-hi) denotes higher-bit embeddings (6-bit vs 4-bit), which should theoretically:
- Improve semantic granularity
- Enhance reasoning in tasks sensitive to fine-grained meaning

But here, we see:
- PIQA wins (+0.5 pp)
- Winogrande and OpenBookQA lose (-1.5 pp and -0.6 pp)
Strategic Recommendation

For PIQA and physical commonsense tasks:
- Qwen3-30B-A3B-YOYO-V4-qx64x-hi
- Best PIQA accuracy (0.770)
- Strong semantic grounding

For Winogrande, BoolQ, and OpenBookQA:
- Qwen3-30B-A3B-YOYO-V4-qx64-hi
- Best Winogrande score (0.622)
- Stronger BoolQ and OpenBookQA
Final Verdict

The x suffix is not universally beneficial. It:
- Improves PIQA
- Weakens Winogrande and OpenBookQA

This is a cognitive trade-off, not an outright upgrade.
Summary Table

| Variant  | PIQA      | Winogrande | OpenBookQA |
|----------|-----------|------------|------------|
| qx64-hi  | 0.765     | **0.622**  | **0.432**  |
| qx64x-hi | **0.770** | 0.607      | 0.426      |

Best score per column in bold.
This model Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx was converted to MLX format from YOYO-AI/Qwen3-30B-A3B-YOYO-V4 using mlx-lm version 0.28.3.
Use with mlx
```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Use the full Hub repo id so the weights resolve from Hugging Face.
model, tokenizer = load("nightmedia/Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx")

prompt = "hello"

# Wrap the prompt in the chat template when the tokenizer provides one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
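If you need to control completion length, `generate` takes a `max_tokens` argument (a standard mlx-lm parameter; the prompt below is just an example):

```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx")

# Raise max_tokens for longer answers; the default cutoff is short.
response = generate(
    model,
    tokenizer,
    prompt="Explain what group size 32 means in quantization.",
    max_tokens=256,
    verbose=True,
)
```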