Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n-mlx
This was uploaded for a limited time only. If it gets popular, I'll keep it; otherwise it's gone in a week.
I RoPEd the model 2x and added one expert.
This seems to smooth out the inference; ymmv.
-G
We're now comparing Qwen3-Next-80B-A3B-Instruct quantized variants, including the 512K-11e-qx65n model, which has two key upgrades:
🔧 Extended context to 512K tokens + 🧠 One extra MoE expert (from 10 to 11)
This model is essentially an "enhanced" version of the baseline Qwen3-Next-80B-A3B-Instruct: more context capacity and a slightly more expressive architecture. Let's position it on the cognitive scale alongside the others.
🧠 Cognitive Scale Positioning of Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n
| Metric        | Score |
|---------------|-------|
| arc_challenge | 0.419 |
| arc_easy      | 0.502 |
| boolq         | 0.898 |
| hellaswag     | 0.544 |
| openbookqa    | 0.416 |
| piqa          | 0.752 |
| winogrande    | 0.565 |
💬 Cognitive Tier Interpretation
- ARC Challenge: Essentially at baseline (0.419, matching the stock qx65n): no real boost for hard reasoning.
- ARC Easy: Very competitive (0.502), though not leading; the context expansion may help, but it doesn't overcome the MoE quantization bottleneck.
- BoolQ: Near the top; 0.898 is only ~0.003 below the best variants (0.901).
- HellaSwag: 0.544 is solid, in line with the other qx65n variants.
- OpenBookQA: 0.416, slightly below most of the field, suggesting some loss of factual-knowledge retention under quantization.
📈 Comparison to Other Quantized Variants
| Variant | arc_challenge | arc_easy | boolq | hellaswag | winogrande |
|---|---|---|---|---|---|
| Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n | 0.419 | 0.502 | 0.898 | 0.544 | 0.565 |
| Qwen3-Next-80B-A3B-Instruct-qx86n | 0.416 | 0.500 | 0.901 | 0.538 | 0.569 |
| Qwen3-Next-80B-A3B-Instruct-qx65n-hi2 | 0.419 | 0.500 | 0.899 | 0.540 | 0.570 |
| Qwen3-Next-80B-A3B-Instruct-qx64n | 0.417 | 0.512 | 0.898 | 0.539 | 0.567 |
| Qwen3-Next-80B-A3B-Instruct-qx54n | 0.418 | 0.497 | 0.901 | 0.582 | 0.601 |
| Qwen3-Next-80B-A3B-Instruct-qx64n-hi | 0.418 | 0.500 | 0.896 | 0.532 | 0.574 |
| Qwen3-Next-80B-A3B-Instruct-qx65n | 0.419 | 0.500 | 0.897 | 0.542 | 0.566 |
| Qwen3-Next-80B-A3B-Instruct-qx86-hi | 0.412 | 0.499 | 0.897 | 0.536 | 0.554 |
| Qwen3-Next-80B-A3B-Instruct-q8 | 0.412 | 0.503 | 0.899 | 0.541 | 0.568 |
Notable Observations:
- The strongest single scores in this table belong to qx86n (0.901 on boolq) and qx54n (0.582 on hellaswag, 0.601 on winogrande).
- qx65n and qx65n-hi2 are nearly identical, suggesting the hi2 setting adds little at this quantization level.
- Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n performs very similarly to the baseline qx65n, indicating that adding one expert and extending the context didn't meaningfully change benchmark accuracy under this quantization; the quick check below quantifies this.
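A quick sanity check on that last point: the per-metric gap between the 512K-11e build and the stock qx65n, computed from the numbers copied verbatim out of the table above (plain Python, no extra dependencies).

```python
# Scores copied from the comparison table above.
metrics = ["arc_challenge", "arc_easy", "boolq", "hellaswag", "winogrande"]
qx65n_512k_11e = [0.419, 0.502, 0.898, 0.544, 0.565]
qx65n_baseline = [0.419, 0.500, 0.897, 0.542, 0.566]

# Per-metric difference (512K-11e minus baseline) and the mean absolute gap.
deltas = {m: round(a - b, 3) for m, a, b in zip(metrics, qx65n_512k_11e, qx65n_baseline)}
mean_abs = sum(abs(d) for d in deltas.values()) / len(deltas)

print(deltas)
print(round(mean_abs, 4))  # ~0.0012
```

The mean absolute difference works out to about 0.001, well within typical run-to-run benchmark noise.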
🧠 Where Does It Fit?
Given its scores, the 512K-11e-qx65n is a mid-tier cognitive model:
- 📌 Tier: Experienced Student (moderate reasoning, good fluency but not top-tier)
- 📌 Cognitive Range: Not quite advanced enough for expert tasks like complex ARC problems or abstract text inference, but strong enough to handle everyday reasoning and general Q&A.
It edges out the 4-bit variants such as qx64n-hi (0.532 on hellaswag), but it doesn't reach qx54n (0.582 hellaswag, 0.601 winogrande) in inference fluidity.
✅ Use Case Recommendations
| Goal | Recommended Model |
|---|---|
| Maximum reasoning | Qwen3-Next-80B-A3B-Instruct-qx86n-hi |
| Best balance | Qwen3-Next-80B-A3B-Instruct-qx65n-hi2 |
| Increased context | Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n |
| Speed over accuracy | Qwen3-Next-80B-A3B-Instruct-qx64n |
| Simple tasks only | Qwen3-Next-80B-A3B-Instruct-qx53n |
📊 Summary Cognitive Ranking of 80B MoE quantizations (Top to Bottom)
- Reasoning Leaders: qx86n-hi, qx65n-hi2, qx86n, qx65n
- Mid-Tier: qx53n, 1M-qx65n, 1M-qx86-hi
- Basic Assistants: qx64n, qx64n-hi, qx86-hi
Judging by the comparison table, qx86n and qx54n are the cognitive powerhouses: they deliver the strongest reasoning and inference scores while staying reasonably efficient.
The 512K-11e-qx65n is a solid, balanced model: a good fit for users who want more context and capacity than the q8 build offers, without needing the absolute top benchmark scores.
Reviewed by Qwen3-VLTO-12B-BX20-TNG-1M-qx86x-hi-mlx
You can revert my changes by commenting out the rope in the config.
-G
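For reference, a minimal sketch of the config entries involved. The field names (rope_scaling, num_experts_per_tok) are assumed from the standard Hugging Face Qwen3-Next config layout and the values are illustrative; verify against the config.json shipped with this repo before editing.

```python
import json

# Sketch only: field names assume the standard Hugging Face Qwen3-Next config
# layout; check the actual config.json in this repo before changing anything.
with open("config.json") as f:
    cfg = json.load(f)

# The 2x RoPE extension (512K context) typically shows up as a rope_scaling
# entry. Removing (or commenting out) that entry reverts to the native window.
print(cfg.get("rope_scaling"))

# "11e" refers to the number of experts activated per token (10 -> 11).
print(cfg.get("num_experts_per_tok"))

# To revert the expert change, set the value back and rewrite the file:
# cfg["num_experts_per_tok"] = 10
# with open("config.json", "w") as f:
#     json.dump(cfg, f, indent=2)
```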
This model Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n-mlx was converted to MLX format from Qwen/Qwen3-Next-80B-A3B-Instruct using mlx-lm version 0.28.4.
Use with mlx
```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer ships one
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
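Since the headline feature is the 512K window, here is a hedged sketch of a long-context call through the same API. The file path, prompt wording, and max_tokens value are placeholders and not part of the original card.

```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-Next-80B-A3B-Instruct-512K-11e-qx65n-mlx")

# Placeholder input: a document long enough to benefit from the extended window.
with open("long_report.txt") as f:
    document = f.read()

messages = [{"role": "user", "content": f"Summarize the key findings:\n\n{document}"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# max_tokens caps the length of the generated summary, not the input context.
response = generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True)
```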