---
language:
- en
license: apache-2.0
library_name: gguf
tags:
- reranker
- gguf
- llama.cpp
base_model: mixedbread-ai/mxbai-rerank-large-v2
---

# mxbai-rerank-large-v2-F16-GGUF

This model was converted to GGUF format from [mixedbread-ai/mxbai-rerank-large-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2) using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the [original model card](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2) for more details on the model.

## Model Information

- **Base Model**: [mixedbread-ai/mxbai-rerank-large-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2)
- **Quantization**: F16
- **Format**: GGUF (GPT-Generated Unified Format)
- **Converted with**: llama.cpp

## Quantization Details

This repository contains the **F16** conversion of the original model. For reference, common GGUF precision levels compare as follows:

- **F16**: Full 16-bit floating point - highest quality, largest size
- **Q8_0**: 8-bit quantization - high quality, good balance
- **Q4_K_M**: 4-bit quantization with medium quality - smaller size, faster inference

## Usage

This model can be used with llama.cpp and other GGUF-compatible inference engines. The example below is a sketch rather than a verified recipe: it assumes a recent llama.cpp build whose `llama-server` binary exposes reranking via the `--reranking` flag and supports this model architecture; consult your llama.cpp version's documentation for the exact option names.

```bash
# Serve the model with reranking enabled
# (flag name may vary by llama.cpp version)
./llama-server -m mxbai-rerank-large-v2-F16.gguf --reranking
```

## Model Files

This repository ships the F16 file; the table below summarizes how common quantization levels trade off quality and size.

| Quantization | Use Case |
|--------------|----------|
| F16 | Maximum quality, largest size |
| Q8_0 | High quality, good balance of size/performance |
| Q4_K_M | Good quality, smallest size, fastest inference |

## Citation

If you use this model, please cite the original model:

```bibtex
# See original model card for citation information
```

## License

This model inherits the license from the original model. Please refer to the [original model card](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2) for license details.

## Acknowledgements

- Original model by the authors of [mixedbread-ai/mxbai-rerank-large-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2)
- GGUF conversion via llama.cpp by ggml.ai
- Converted and uploaded by [sinjab](https://huggingface.co/sinjab)
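
## Example: Querying the Rerank Endpoint

Once the server from the Usage section is running, candidate documents can be scored against a query over HTTP. The sketch below assumes llama-server exposes a `/v1/rerank` endpoint on the default port 8080 and accepts `query`/`documents` fields, as described in llama.cpp's server documentation; the query and documents are hypothetical placeholders.

```bash
# Score two hypothetical documents against a query via the rerank endpoint.
# Endpoint path, port, and field names depend on your llama.cpp version.
curl http://localhost:8080/v1/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mxbai-rerank-large-v2-F16.gguf",
    "query": "What is the capital of France?",
    "documents": [
      "Paris is the capital of France.",
      "Berlin is the capital of Germany."
    ]
  }'
```

The response is expected to contain a relevance score per document (keyed by document index), which can be used to reorder the candidates before passing them to a downstream retrieval-augmented pipeline.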