---
library_name: vllm
language:
- en
- fr
- es
- de
- it
- pt
- nl
- zh
- ja
- ko
- ar
license: apache-2.0
inference: false
extra_gated_description: >-
If you want to learn more about how we process your personal data, please read
our Privacy Policy.
tags:
- mistral-common
- transformers
---
# Ministral 3 3B Base 2512
The smallest model in the Ministral 3 family, **Ministral 3 3B** is a powerful, efficient tiny language model with vision capabilities.
This model is the base pre-trained version, not fine-tuned for instruction or reasoning tasks, making it ideal for custom post-training processes.
For instruction-following and chat-based use cases, we recommend using [Ministral 3 3B Instruct 2512](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512).
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, fitting in 16GB of VRAM in BF16, and less than 8GB of RAM/VRAM when quantized.
## Key Features
Ministral 3 3B consists of two main architectural components:
- **3.4B Language Model**
- **0.4B Vision Encoder**
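As a rough, back-of-the-envelope check of the VRAM figures quoted in the introduction (an illustrative estimate, not an official measurement), the weight footprint follows directly from these parameter counts:

```python
# Rough weight-memory estimate for Ministral 3 3B (illustrative only; actual
# usage also includes activations, the KV cache, and framework overhead).
params = 3.4e9 + 0.4e9  # language model + vision encoder parameters

print(f"BF16 weights (2 bytes/param):    ~{params * 2.0 / 1e9:.1f} GB")  # ~7.6 GB
print(f"4-bit weights (0.5 bytes/param): ~{params * 0.5 / 1e9:.1f} GB")  # ~1.9 GB
```

The headroom between the ~7.6 GB of BF16 weights and the 16 GB figure is what activations and the KV cache consume, especially at long context lengths.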
The Ministral 3 3B Base model offers the following capabilities:
- **Vision**: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- **Multilingual**: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
- **Edge-Optimized**: Delivers best-in-class performance at a small scale, deployable anywhere.
- **Apache 2.0 License**: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- **Large Context Window**: Supports a 256k context window.
### Use Cases
Ideal for lightweight, real-time applications on edge or low-resource devices, such as:
- Image captioning
- Text classification
- Real-time efficient translation
- Data extraction
- Short content generation
- Fine-tuning and specialization
- And more...
Ministral 3 brings advanced AI capabilities to edge, distributed, and embedded environments.
## Ministral 3 Family
| Model Name | Type | Precision | Link |
|--------------------------------|--------------------|-----------|------------------------------------------------------------------------------------------|
| **Ministral 3 3B Base 2512** | **Base pre-trained** | **BF16** | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Base-2512) |
| Ministral 3 3B Instruct 2512 | Instruct post-trained | FP8 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512) |
| Ministral 3 3B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Reasoning-2512) |
| Ministral 3 8B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Base-2512) |
| Ministral 3 8B Instruct 2512 | Instruct post-trained | FP8 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512) |
| Ministral 3 8B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512) |
| Ministral 3 14B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Base-2512) |
| Ministral 3 14B Instruct 2512 | Instruct post-trained | FP8 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512) |
| Ministral 3 14B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512) |
Other formats available [here](https://huggingface.co/collections/mistralai/ministral-3-additional-checkpoints).
## Benchmark Results
We compare Ministral 3 to similarly sized models.
### Reasoning
| Model | AIME25 | AIME24 | GPQA Diamond | LiveCodeBench |
|---------------------------|-------------|-------------|--------------|---------------|
| **Ministral 3 14B** | 0.850| 0.898| 0.712 | 0.646 |
| Qwen3-14B (Thinking) | 0.737 | 0.837 | 0.663 | 0.593 |
| | | | | |
| **Ministral 3 8B** | 0.787 | 0.860| 0.668 | 0.616 |
| Qwen3-VL-8B-Thinking | 0.798| 0.860| 0.671 | 0.580 |
| | | | | |
| **Ministral 3 3B** | 0.721| 0.775| 0.534 | 0.548 |
| Qwen3-VL-4B-Thinking | 0.697 | 0.729 | 0.601 | 0.513 |
### Instruct
| Model | Arena Hard | WildBench | MATH Maj@1 | MM MTBench |
|---------------------------|-------------|------------|-------------|------------------|
| **Ministral 3 14B** | 0.551| 68.5| 0.904| 8.49 |
| Qwen3 14B (Non-Thinking) | 0.427 | 65.1 | 0.870 | NOT MULTIMODAL |
| Gemma3-12B-Instruct | 0.436 | 63.2 | 0.854 | 6.70 |
| | | | | |
| **Ministral 3 8B** | 0.509 | 66.8| 0.876 | 8.08 |
| Qwen3-VL-8B-Instruct | 0.528| 66.3 | 0.946| 8.00 |
| | | | | |
| **Ministral 3 3B** | 0.305 | 56.8| 0.830 | 7.83 |
| Qwen3-VL-4B-Instruct | 0.438| 56.8| 0.900| 8.01 |
| Qwen3-VL-2B-Instruct | 0.163 | 42.2 | 0.786 | 6.36 |
| Gemma3-4B-Instruct | 0.318 | 49.1 | 0.759 | 5.23 |
### Base
| Model | Multilingual MMLU | MATH CoT 2-Shot | AGIEval 5-shot | MMLU Redux 5-shot | MMLU 5-shot | TriviaQA 5-shot |
|---------------------|-------------------|-----------------|----------------|-------------------|-------------|-----------------|
| **Ministral 3 14B** | 0.742 | 0.676 | 0.648 | 0.820 | 0.794 | 0.749 |
| Qwen3 14B Base | 0.754 | 0.620 | 0.661 | 0.837 | 0.804| 0.703 |
| Gemma 3 12B Base | 0.690 | 0.487 | 0.587 | 0.766 | 0.745 | 0.788 |
| | | | | | | |
| **Ministral 3 8B** | 0.706 | 0.626 | 0.591 | 0.793 | 0.761| 0.681 |
| Qwen 3 8B Base | 0.700 | 0.576 | 0.596 | 0.794 | 0.760 | 0.639 |
| | | | | | | |
| **Ministral 3 3B** | 0.652 | 0.601 | 0.511 | 0.735 | 0.707 | 0.592 |
| Qwen 3 4B Base | 0.677 | 0.405 | 0.570 | 0.759 | 0.713| 0.530 |
| Gemma 3 4B Base | 0.516 | 0.294 | 0.430 | 0.626 | 0.589 | 0.640 |
## Usage
The model can be used with the following frameworks:
- [`vllm`](https://github.com/vllm-project/vllm): See [here](#vllm)
- [`transformers`](https://github.com/huggingface/transformers): See [here](#transformers)
### vLLM
We recommend using this model with [vLLM](https://github.com/vllm-project/vllm).
#### Installation
Make sure to install the most recent version of vLLM:
```bash
uv pip install -U vllm \
--torch-backend=auto \
--extra-index-url https://wheels.vllm.ai/nightly
```
Doing so should automatically install [`mistral_common >= 1.8.6`](https://github.com/mistralai/mistral-common/releases/tag/v1.8.6).
To check:
```bash
python -c "import mistral_common; print(mistral_common.__version__)"
```
You can also make use of a ready-to-go [Docker image](https://github.com/vllm-project/vllm/tree/main/docker) or one from [Docker Hub](https://hub.docker.com/layers/vllm/vllm-openai/latest/images).
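For example, a typical containerized launch could look like this (a sketch assuming the `vllm/vllm-openai:latest` image and a local Hugging Face cache; adapt the mounts and tag to your environment):

```bash
docker run --gpus all --ipc=host -p 8000:8000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    vllm/vllm-openai:latest \
    --model mistralai/Ministral-3-3B-Base-2512 \
    --tokenizer_mode mistral --config_format mistral --load_format mistral
```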
#### Serve
Due to their size and the BF16 format of their weights, `Ministral-3-3B-Base-2512` and `Ministral-3-8B-Base-2512` can each run on a single H200 GPU.
A simple launch command is:
```bash
vllm serve mistralai/Ministral-3-3B-Base-2512 \
--tokenizer_mode mistral --config_format mistral --load_format mistral
```
Additional flags:
* You can lower `--max-model-len` to save memory. By default it is set to `262144`, which is larger than most scenarios require.
* You can set `--max-num-batched-tokens` to balance throughput and latency: higher values increase throughput at the cost of latency. An example combining both flags is shown below.
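For example, a more memory-conscious launch might look like this (the values are illustrative, not official recommendations):

```bash
vllm serve mistralai/Ministral-3-3B-Base-2512 \
    --tokenizer_mode mistral --config_format mistral --load_format mistral \
    --max-model-len 32768 \
    --max-num-batched-tokens 8192
```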
#### Usage of the model
Here we assume that the model `mistralai/Ministral-3-3B-Base-2512` is being served and reachable at `localhost` on port `8000`, the vLLM default.
Quick test of the base model through the OpenAI-compatible completions endpoint:
```python
from openai import OpenAI
# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"
TEMP = 0.15  # sampling temperature
MAX_TOK = 256  # maximum number of generated tokens
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id

response = client.completions.create(
    model=model,
    prompt="What is the best thing in the universe?",
    temperature=TEMP,
    max_tokens=MAX_TOK,
)
print(response.choices[0].text)
```
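The same completions endpoint also supports streaming, which is handy for interactive use. A minimal variation of the snippet above, using the standard streaming mode of the OpenAI client (the prompt is arbitrary):

```python
# Stream the completion token by token instead of waiting for the full text.
stream = client.completions.create(
    model=model,
    prompt="The three most remarkable things about the universe are",
    temperature=TEMP,
    max_tokens=MAX_TOK,
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].text, end="", flush=True)
print()
```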
### Transformers
You can also use Ministral 3 3B Base 2512 with `Transformers`!
Make sure to install `Transformers` from its first v5 release candidate or from "main":
```bash
pip install transformers==5.0.0rc0
```
To make the best use of our model with `Transformers`, make sure to have [`mistral-common >= 1.8.6`](https://github.com/mistralai/mistral-common) installed to use our tokenizer.
```bash
pip install mistral-common --upgrade
```
Then load our tokenizer along with the model and generate:
```python
from transformers import Mistral3ForConditionalGeneration, MistralCommonBackend
model_id = "mistralai/Ministral-3-3B-Base-2512"
model = Mistral3ForConditionalGeneration.from_pretrained(
    model_id,
    device_map="auto",
)
tokenizer = MistralCommonBackend.from_pretrained(model_id)

input_ids = tokenizer.encode("Once upon a time, France was a", return_tensors="pt")
input_ids = input_ids.to("cuda")

output = model.generate(
    input_ids,
    max_new_tokens=30,
)[0]

decoded_output = tokenizer.decode(output[len(input_ids[0]):])
print(decoded_output)
```
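Greedy decoding can get repetitive with base models, so you may want to sample instead. A minimal variant of the snippet above using standard `generate` sampling arguments (the temperature and top-p values are illustrative assumptions, not official recommendations):

```python
# Sampled generation with standard transformers `generate` kwargs.
output = model.generate(
    input_ids,
    max_new_tokens=30,
    do_sample=True,
    temperature=0.15,
    top_p=0.95,
)[0]
print(tokenizer.decode(output[len(input_ids[0]):]))
```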
## License
This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
*You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.*