Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx

This is a direct benchmark comparison between:

  • ✅ Qwen3-30B-A3B-YOYO-V4-qx64-hi
  • ✅ Qwen3-30B-A3B-YOYO-V4-qx64x-hi

These variants differ only in embedding bit depth: qx64-hi uses 4-bit embeddings, while qx64x-hi uses 6-bit embeddings.

Both use:

  • Weights: 4-bit
  • Attention paths & Heads: 6-bit
  • Group Size: 32 (hi suffix)
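
The recipe itself is not published as code in this card. As a rough illustration only, mixed bit-depth schemes of this shape can be expressed through mlx-lm's `quant_predicate` hook; in the sketch below, the path matching (`embed_tokens`, `self_attn`) and the exact set of 6-bit tensors are assumptions, not the actual qx64x-hi recipe:

```python
# Hypothetical sketch of a qx64x-hi-style mixed-precision recipe using
# mlx-lm's convert() and its quant_predicate hook. The path matching is
# illustrative, not the recipe actually used for this model.
from mlx_lm import convert

def qx64x_hi_predicate(path, module, config):
    # 6-bit for embeddings and attention projections, 4-bit elsewhere;
    # group size 32 throughout (the "hi" suffix).
    if "embed_tokens" in path or "self_attn" in path:
        return {"bits": 6, "group_size": 32}
    return {"bits": 4, "group_size": 32}

convert(
    "YOYO-AI/Qwen3-30B-A3B-YOYO-V4",
    mlx_path="Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx",
    quantize=True,
    quant_predicate=qx64x_hi_predicate,
)
```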

📊 Benchmark Comparison

| Benchmark     | qx64-hi | qx64x-hi | Delta  |
|---------------|---------|----------|--------|
| arc_challenge | 0.494   | 0.494    | 0.000  |
| arc_easy      | 0.638   | 0.638    | 0.000  |
| boolq         | 0.886   | 0.884    | -0.002 |
| hellaswag     | 0.640   | 0.640    | 0.000  |
| openbookqa    | 0.432   | 0.426    | -0.006 |
| piqa          | 0.765   | 0.770    | +0.005 |
| winogrande    | 0.622   | 0.607    | -0.015 |
| aggregate avg | 0.617   | 0.613    | -0.004 |
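
For reference, the per-task deltas above can be reproduced directly from the scores (a small sketch, with the numbers copied from the table):

```python
# Recompute the per-task deltas shown in the table above.
scores = {
    # task: (qx64-hi, qx64x-hi)
    "arc_challenge": (0.494, 0.494),
    "arc_easy":      (0.638, 0.638),
    "boolq":         (0.886, 0.884),
    "hellaswag":     (0.640, 0.640),
    "openbookqa":    (0.432, 0.426),
    "piqa":          (0.765, 0.770),
    "winogrande":    (0.622, 0.607),
}
for task, (base, x) in scores.items():
    print(f"{task:13s} delta = {x - base:+.3f}")
```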

🧠 Cognitive Impact Analysis

✅ PIQA (+0.005)

  • qx64x-hi leads by 0.5 percentage points → this is a semantic granularity win in physical commonsense reasoning.

❌ Winogrande (-0.015)

  • qx64-hi is notably better → this is unexpected at first glance.

💡 Interpretation:

  • Winogrande requires pronoun disambiguation and subtle syntactic parsing.
  • The qx64-hi variant's coarser 4-bit embeddings may hold a slight edge here, suggesting that extra embedding precision does not help, and can even hinder, this kind of syntactic disambiguation.

❌ BoolQ (-0.002)

  • qx64-hi marginally better → Boolean QA may favor lower-bit embeddings for pattern matching.

❌ OpenBookQA (-0.006)

  • qx64-hi slightly better → knowledge retrieval may benefit from a more compressed semantic space.

🧠 Why the x suffix matters (and doesn’t)

The x suffix (qx64x-hi) denotes higher-bit embeddings (6-bit vs. 4-bit), which should theoretically:

  • ✅ Improve semantic granularity
  • ✅ Enhance reasoning in tasks sensitive to fine-grained meaning

But here, we see:

  • ✅ PIQA wins (+0.5%)
  • ❌ Winogrande and OpenBookQA lose (-1.5%, -0.6%)
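
The cost side of the trade-off is easier to pin down. Below is a back-of-the-envelope sketch of the embedding table's memory at each bit depth, assuming Qwen3-30B-A3B's config values of roughly 152k vocabulary entries and a hidden size of 2048 (worth verifying against the model's config.json):

```python
# Rough embedding-table memory at 4-bit vs 6-bit (quantized payload only;
# group-size-32 scales and biases add further overhead on top).
# vocab_size and hidden_size are assumed Qwen3-30B-A3B config values.
vocab_size, hidden_size = 151_936, 2048
embed_params = vocab_size * hidden_size  # ~311M parameters

for bits in (4, 6):
    mib = embed_params * bits / 8 / 2**20
    print(f"{bits}-bit embeddings: ~{mib:,.0f} MiB")
# ~148 MiB at 4-bit vs ~223 MiB at 6-bit: ~75 MiB extra for the x variant.
```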

🚀 Strategic Recommendation

✅ For PIQA and physical commonsense tasks:

  • 👉 Qwen3-30B-A3B-YOYO-V4-qx64x-hi
    • Best PIQA accuracy (0.770)
    • Strong semantic grounding

✅ For Winogrande, BoolQ, and OpenBookQA:

  • 👉 Qwen3-30B-A3B-YOYO-V4-qx64-hi
    • Best Winogrande score (0.622)
    • Strong BoolQ and OpenBookQA

📌 Final Verdict

The x suffix is not universally beneficial. It:

  • ✅ Improves PIQA
  • ❌ Weakens Winogrande and OpenBookQA

This is a cognitive trade-off, not an outright upgrade.

📊 Summary Table

| Variant  | PIQA  | Winogrande | OpenBookQA |
|----------|-------|------------|------------|
| qx64-hi  | 0.765 | ✅ 0.622   | ✅ 0.432   |
| qx64x-hi | 0.770 | ❌ 0.607   | ❌ 0.426   |

This model, Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx, was converted to MLX format from YOYO-AI/Qwen3-30B-A3B-YOYO-V4 using mlx-lm version 0.28.3.

Use with mlx

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer provides one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
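
mlx-lm also installs a command-line generator, which is a quick way to smoke-test the model without writing any Python (flag names per recent mlx-lm releases; adjust if your version differs):

```shell
mlx_lm.generate --model nightmedia/Qwen3-30B-A3B-YOYO-V4-qx64x-hi-mlx \
  --prompt "hello" --max-tokens 256
```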