---
base_model:
- mistralai/Mixtral-8x22B-v0.1
base_model_relation: quantized
pipeline_tag: text-generation
tags:
- quantized
- hardware-optimized
- mixtral
- tensordyne
license: apache-2.0
---
## Overview

Tensordyne builds advanced [AI-inference systems](https://www.tensordyne.ai/inference-system), enabling faster, more affordable, and sustainable generative AI.

This repository provides resources to quickly get started with **[Mixtral-8x22B](https://huggingface.co/mistralai/Mixtral-8x22B-v0.1)** on the **Tensordyne Inference System and its SDK**.
## Model Details

- **Quantization:** post-training quantization of the base model; no fine-tuning or additional training was performed
- **Supported data types:** Tensordyne FP16 (tFP16), Tensordyne FP8 (tFP8), mixed-precision
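As a generic illustration of what post-training quantization to a low-precision float format does, the toy sketch below rounds a value to the nearest point on a reduced-mantissa grid. This is purely illustrative and not the actual tFP16/tFP8 implementation; the real formats' exponent range, mantissa width, and rounding behavior may differ.

```python
import math

def quantize_low_precision(x: float, mantissa_bits: int = 3) -> float:
    """Round x to the nearest value representable with the given number
    of explicit mantissa bits. Toy FP8-style rounding; exponent-range
    clipping and the exact tFP8 layout are intentionally omitted."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)              # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)  # grid resolution within one binade
    return math.ldexp(round(m * scale) / scale, e)

print(quantize_low_precision(1.1))  # 1.125, the nearest 3-bit-mantissa value
```

The rounding error introduced by this step is what the calibration-based strategies below aim to keep small on representative data.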
## Quantization

The Tensordyne SDK offers multiple post-training quantization strategies to convert AI models for efficient inference on the Tensordyne Inference System, fully customizable for your optimization targets.

Here we showcase several preselected quantization variants that can be applied on the fly to convert the model to Tensordyne data types. The calibration-based strategies are defined by quantization configurations provided as `.json` files.

The quantized models are evaluated on 10% of the [WikiText-2 raw v1](https://huggingface.co/datasets/Salesforce/wikitext) test set. A negative relative perplexity drop indicates that the quantized model performs better than the float base model.
| Model Configuration | Absolute Perplexity | Relative Perplexity Drop vs. BF16 | Details |
|--------------------------------|---------------------|-----------------------------------|------------------------------------------------------------|
| BF16 | 2.923 | – | The baseline model trained in BF16 |
| calibration_free_tFP16 | 2.921 | -0.05 % | calibration-free tFP16 quantization |
| calibration_based_tFP16 | 2.923 | 0.00 % | calibration-based tFP16 quantization |
| layerwise_mixed_precision | 2.932 | 0.30 % | calibration-based mixed-precision: tFP8, outliers in tFP16 |
| calibration_free_dynamic_tFP8 | 2.926 | 0.13 % | calibration-free tFP8 dynamic quantization |
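For reference, the two metrics in the table can be computed as sketched below, assuming the usual definitions (perplexity as the exponential of the mean per-token negative log-likelihood, and the drop as a percentage relative to the baseline). The numbers in the example are illustrative; the table's percentages are derived from unrounded perplexities, so recomputing them from the rounded values shown may differ slightly.

```python
import math

def perplexity(nll_per_token):
    # Perplexity is exp of the mean negative log-likelihood per token.
    return math.exp(sum(nll_per_token) / len(nll_per_token))

def relative_drop(ppl, ppl_baseline):
    # Relative perplexity drop vs. the baseline, in percent.
    # Negative values mean the quantized model scores better.
    return (ppl - ppl_baseline) / ppl_baseline * 100

print(round(relative_drop(2.95, 2.90), 2))  # 1.72
```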
## Getting Started

Refer to the [Tensordyne Hugging Face Hub tutorial](https://resources.tensordyne.ai/sdk/v0.1.1/tutorials/tutorials/#tensordyne-hugging-face-hub-tutorials) for instructions on using the artifacts provided in this repository.

Our [hosted documentation](https://resources.tensordyne.ai/sdk/v0.1.1/) provides more information on Tensordyne's quantization strategies and introduces you to our SDK.