| license: apache-2.0 | |
| library_name: mlx | |
| datasets: | |
| - DavidAU/ST-TheNextGeneration | |
| language: | |
| - en | |
| - fr | |
| - zh | |
| - de | |
| tags: | |
| - programming | |
| - code generation | |
| - code | |
| - codeqwen | |
| - moe | |
| - coding | |
| - coder | |
| - qwen2 | |
| - chat | |
| - qwen | |
| - qwen-coder | |
| - Qwen3-Coder-30B-A3B-Instruct | |
| - Qwen3-30B-A3B | |
| - mixture of experts | |
| - 128 experts | |
| - 8 active experts | |
| - 1 million context | |
| - qwen3 | |
| - finetune | |
| - brainstorm 20x | |
| - brainstorm | |
| - optional thinking | |
| - qwen3_moe | |
| - unsloth | |
| - mlx | |
| base_model: DavidAU/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III | |
| pipeline_tag: text-generation | |
| # Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III-mxfp4-mlx | |
| This quant is scheduled to be deleted due to the limits on accounts imposed by HuggingFace. | |
| There is nothing I can do but delete old models to make room. | |
| Please archive this model locally as I will not be able to upload a new one. | |
| You can still use the slightly larger [Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III qx86-hi-mlx](https://huggingface.co/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III-qx86-hi-mlx) instead. It performs much better, and I am the only one able to create the qx quants. | |
| If you want to create your own mxfp4 model from a source, you can use the mlx tools as follows: | |
| ```bash | |
| mlx_lm.convert --hf-path My/Model --mlx-path My-Model-mxfp4-mlx -q --q-bits 4 --q-group-size 32 --q-mode mxfp4 | |
| ``` | |
| And then you wait. You need a lot of RAM and patience. | |
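As a rough guide to how much memory the converted weights occupy, the sketch below estimates the size of an mxfp4 model. It assumes the MXFP4 layout from the OCP Microscaling format (4-bit elements sharing one 8-bit scale per 32-element group, which matches `--q-group-size 32`), that every parameter is quantized, and that the "42B" in the model name is the total parameter count. The function names are illustrative and not part of mlx-lm. The figure covers only the resulting weights; the conversion step also has to read the higher-precision source weights.

```python
def mxfp4_bits_per_weight(group_size: int = 32, scale_bits: int = 8) -> float:
    """Effective bits per weight: one 4-bit element plus its share of the group scale."""
    return 4 + scale_bits / group_size


def mxfp4_size_gb(num_params: float, group_size: int = 32) -> float:
    """Approximate weight size in GB (10^9 bytes), assuming every parameter is quantized."""
    total_bits = num_params * mxfp4_bits_per_weight(group_size)
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # 42e9 is an assumption read off the "42B" in the model name.
    print(f"{mxfp4_bits_per_weight():.2f} bits per weight")
    print(f"~{mxfp4_size_gb(42e9):.1f} GB of weights")
```

Under these assumptions the estimate comes to 4.25 bits per weight, or about 22.3 GB for 42 billion parameters. Real checkpoints can differ, since some layers may be kept at higher precision.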
| Sorry for the inconvenience. | |
| -G | |
| This model [Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III-mxfp4-mlx](https://huggingface.co/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III-mxfp4-mlx) was | |
| converted to MLX format from [DavidAU/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III](https://huggingface.co/DavidAU/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III) | |
| using mlx-lm version **0.28.2**. | |
| ## Use with mlx | |
| ```bash | |
| pip install mlx-lm | |
| ``` | |
| ```python | |
| from mlx_lm import load, generate | |
| # Load the quantized weights and the tokenizer | |
| model, tokenizer = load("Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III-mxfp4-mlx") | |
| prompt = "hello" | |
| # Wrap the prompt in the model's chat template when one is defined | |
| if tokenizer.chat_template is not None: | |
|     messages = [{"role": "user", "content": prompt}] | |
|     prompt = tokenizer.apply_chat_template( | |
|         messages, add_generation_prompt=True | |
|     ) | |
| response = generate(model, tokenizer, prompt=prompt, verbose=True) | |
| ``` | |
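Because this is a thinking model, the generated text may begin with a reasoning trace. The sketch below separates that trace from the final answer. It assumes this finetune keeps the Qwen3 convention of wrapping reasoning in `<think>...</think>` tags; `split_thinking` is a hypothetical helper, not part of mlx-lm. Depending on the chat template, the opening tag may be part of the prompt and so missing from the output, which the helper tolerates.

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a response that may contain a think block."""
    close_tag = "</think>"
    head, sep, tail = text.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole response as the answer.
        return "", text.strip()
    # Drop the opening tag if the model emitted it.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


if __name__ == "__main__":
    # Made-up sample text, not real model output.
    sample = "<think>\nThe user greeted me.\n</think>\n\nHello! How can I help?"
    reasoning, answer = split_thinking(sample)
    print(answer)
```

With the snippet above, `reasoning, answer = split_thinking(response)` can be applied to the `response` returned by `generate`.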