Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx

This is a direct benchmark comparison between:

  • ✅ Qwen3-30B-A3B-YOYO-V4-qx64-hi
  • ✅ Qwen3-30B-A3B-YOYO-V4-qx64x-hi

These variants differ only in embedding bit depth: qx64-hi uses 4-bit embeddings, while qx64x-hi uses 6-bit embeddings.

Both use:

  • Weights: 4-bit
  • Attention paths & Heads: 6-bit
  • Group Size: 32 (hi suffix)
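
The recipe itself is not published as code in this card. As a rough illustration only, mixed bit-depth schemes of this shape can be expressed through mlx-lm's `quant_predicate` hook; in the sketch below, the path matching (`embed_tokens`, `self_attn`) and the exact set of 6-bit tensors are assumptions, not the actual qx64x-hi recipe:

```python
# Hypothetical sketch of a qx64x-hi-style mixed-precision recipe using
# mlx-lm's convert() and its quant_predicate hook. The path matching is
# illustrative, not the recipe actually used for this model.
from mlx_lm import convert

def qx64x_hi_predicate(path, module, config):
    # 6-bit for embeddings and attention projections, 4-bit elsewhere;
    # group size 32 throughout (the "hi" suffix).
    if "embed_tokens" in path or "self_attn" in path:
        return {"bits": 6, "group_size": 32}
    return {"bits": 4, "group_size": 32}

convert(
    "YOYO-AI/Qwen3-30B-A3B-YOYO-V4",
    mlx_path="Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx",
    quantize=True,
    quant_predicate=qx64x_hi_predicate,
)
```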

📊 Benchmark Comparison

| Benchmark     | qx64-hi | qx64x-hi | Delta  |
|---------------|---------|----------|--------|
| arc_challenge | 0.494   | 0.494    | 0.000  |
| arc_easy      | 0.638   | 0.638    | 0.000  |
| boolq         | 0.886   | 0.884    | -0.002 |
| hellaswag     | 0.640   | 0.640    | 0.000  |
| openbookqa    | 0.432   | 0.426    | -0.006 |
| piqa          | 0.765   | 0.770    | +0.005 |
| winogrande    | 0.622   | 0.607    | -0.015 |
| aggregate avg | 0.617   | 0.613    | -0.004 |
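
For reference, the per-task deltas above can be reproduced directly from the scores (a small sketch, with the numbers copied from the table):

```python
# Recompute the per-task deltas shown in the table above.
scores = {
    # task: (qx64-hi, qx64x-hi)
    "arc_challenge": (0.494, 0.494),
    "arc_easy":      (0.638, 0.638),
    "boolq":         (0.886, 0.884),
    "hellaswag":     (0.640, 0.640),
    "openbookqa":    (0.432, 0.426),
    "piqa":          (0.765, 0.770),
    "winogrande":    (0.622, 0.607),
}
for task, (base, x) in scores.items():
    print(f"{task:13s} delta = {x - base:+.3f}")
```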

🧠 Cognitive Impact Analysis

✅ PIQA (+0.005)

  • qx64x-hi leads by 0.5 percentage points → this is a semantic granularity win in physical commonsense reasoning.

❌ Winogrande (-0.015)

  • qx64-hi is notably better → this is unexpected at first glance.

💡 Interpretation:

  • Winogrande requires pronoun disambiguation and subtle syntactic parsing.
  • The qx64-hi variant's coarser 4-bit embeddings may hold a slight edge here, suggesting that extra embedding precision does not help, and can even hinder, this kind of syntactic disambiguation.

❌ BoolQ (-0.002)

  • qx64-hi marginally better → Boolean QA may favor lower-bit embeddings for pattern matching.

❌ OpenBookQA (-0.006)

  • qx64-hi slightly better → knowledge retrieval may benefit from a more compressed semantic space.

🧠 Why the x suffix matters (and doesn’t)

The x suffix (qx64x-hi) denotes higher-bit embeddings (6-bit vs. 4-bit), which should theoretically:

  • ✅ Improve semantic granularity
  • ✅ Enhance reasoning in tasks sensitive to fine-grained meaning

But here, we see:

  • ✅ PIQA wins (+0.5%)
  • ❌ Winogrande and OpenBookQA lose (-1.5%, -0.6%)
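
The cost side of the trade-off is easier to pin down. Below is a back-of-the-envelope sketch of the embedding table's memory at each bit depth, assuming Qwen3-30B-A3B's config values of roughly 152k vocabulary entries and a hidden size of 2048 (worth verifying against the model's config.json):

```python
# Rough embedding-table memory at 4-bit vs 6-bit (quantized payload only;
# group-size-32 scales and biases add further overhead on top).
# vocab_size and hidden_size are assumed Qwen3-30B-A3B config values.
vocab_size, hidden_size = 151_936, 2048
embed_params = vocab_size * hidden_size  # ~311M parameters

for bits in (4, 6):
    mib = embed_params * bits / 8 / 2**20
    print(f"{bits}-bit embeddings: ~{mib:,.0f} MiB")
# ~148 MiB at 4-bit vs ~223 MiB at 6-bit: ~75 MiB extra for the x variant.
```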

🚀 Strategic Recommendation

✅ For PIQA and physical commonsense tasks:

  • 👉 Qwen3-30B-A3B-YOYO-V4-qx64x-hi
    • Best PIQA accuracy (0.770)
    • Strong semantic grounding

✅ For Winogrande, BoolQ, and OpenBookQA:

  • 👉 Qwen3-30B-A3B-YOYO-V4-qx64-hi
    • Best Winogrande score (0.622)
    • Strong BoolQ and OpenBookQA

📌 Final Verdict

The x suffix is not universally beneficial. It:

  • ✅ Improves PIQA
  • ❌ Weakens Winogrande and OpenBookQA

This is a cognitive trade-off, not an outright upgrade.

📊 Summary Table

| Variant  | PIQA  | Winogrande | OpenBookQA |
|----------|-------|------------|------------|
| qx64-hi  | 0.765 | ✅ 0.622   | ✅ 0.432   |
| qx64x-hi | 0.770 | ❌ 0.607   | ❌ 0.426   |

This model, Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx, was converted to MLX format from YOYO-AI/Qwen3-30B-A3B-YOYO-V4 using mlx-lm version 0.28.3.

Use with mlx

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer provides one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
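
mlx-lm also installs a command-line generator, which is a quick way to smoke-test the model without writing any Python (flag names per recent mlx-lm releases; adjust if your version differs):

```shell
mlx_lm.generate --model nightmedia/Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx \
  --prompt "hello" --max-tokens 256
```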