|
|
--- |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- chat |
|
|
base_model: |
|
|
- Qwen/Qwen3-30B-A3B-Thinking-2507 |
|
|
base_model_relation: quantized |
|
|
--- |
|
|
# Qwen3-30B-A3B-Thinking-2507-MNN |
|
|
|
|
|
## Introduction |
|
|
This model is a 4-bit quantized MNN export of Qwen3-30B-A3B-Thinking-2507, produced with [llmexport](https://github.com/alibaba/MNN/tree/master/transformers/llm/export). |
|
|
|
|
|
## Download |
|
|
```bash |
|
|
# install huggingface_hub |


pip install huggingface_hub |
|
|
``` |
|
|
```bash |
|
|
# CLI download |


huggingface-cli download taobao-mnn/Qwen3-30B-A3B-Thinking-2507-MNN --local-dir path/to/dir |
|
|
``` |
|
|
```python |
|
|
# SDK download |
|
|
from huggingface_hub import snapshot_download |
|
|
model_dir = snapshot_download('taobao-mnn/Qwen3-30B-A3B-Thinking-2507-MNN') |
|
|
``` |
|
|
|
|
|
```bash |
|
|
# git clone |
|
|
git clone https://www.modelscope.cn/taobao-mnn/Qwen3-30B-A3B-Thinking-2507-MNN |
|
|
``` |
|
|
|
|
|
## Usage |
|
|
```bash |
|
|
# clone MNN source |
|
|
git clone https://github.com/alibaba/MNN.git |
|
|
|
|
|
# compile |
|
|
cd MNN |
|
|
mkdir build && cd build |
|
|
cmake .. -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true |
|
|
make -j |
|
|
|
|
|
# run |
|
|
./llm_demo /path/to/Qwen3-30B-A3B-Thinking-2507-MNN/config.json prompt.txt |
|
|
``` |
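`llm_demo` takes a prompt file as its second argument. A minimal sketch of creating one (the file name and the prompt text here are just placeholders; use any prompts you like, one per line):

```bash
# Hypothetical prompt file: one prompt per line.
printf 'Hello, please introduce yourself.\n' > prompt.txt
cat prompt.txt
```

The `config.json` passed as the first argument ships with the downloaded model directory and points the runtime at the quantized weights.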
|
|
|
|
|
## Documentation |
|
|
[MNN-LLM](https://mnn-docs.readthedocs.io/en/latest/transformers/llm.html#) |
|
|
|