---
language:
- en
license: apache-2.0
library_name: gguf
tags:
- reranker
- gguf
- llama.cpp
base_model: mixedbread-ai/mxbai-rerank-large-v2
---

# mxbai-rerank-large-v2-F16-GGUF

This model was converted to GGUF format from [mixedbread-ai/mxbai-rerank-large-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2) using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the [original model card](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2) for more details on the model.

## Model Information

- **Base Model**: [mixedbread-ai/mxbai-rerank-large-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2)
- **Quantization**: F16
- **Format**: GGUF (GPT-Generated Unified Format)
- **Converted with**: llama.cpp

## Quantization Details

This repository contains the **F16** conversion of the original model. For reference, common GGUF precision levels compare as follows:

- **F16**: Full 16-bit floating point - highest quality, largest size
- **Q8_0**: 8-bit quantization - high quality, good balance
- **Q4_K_M**: 4-bit quantization with medium quality - smaller size, faster inference

## Usage

This model can be used with llama.cpp and other GGUF-compatible inference engines. The example below is a sketch rather than a verified recipe: it assumes a recent llama.cpp build whose `llama-server` binary exposes reranking via the `--reranking` flag and supports this model architecture; consult your llama.cpp version's documentation for the exact option names.

```bash
# Serve the model with reranking enabled
# (flag name may vary by llama.cpp version)
./llama-server -m mxbai-rerank-large-v2-F16.gguf --reranking
```

## Model Files

This repository ships the F16 file; the table below summarizes how common quantization levels trade off quality and size.

| Quantization | Use Case |
|--------------|----------|
| F16 | Maximum quality, largest size |
| Q8_0 | High quality, good balance of size/performance |
| Q4_K_M | Good quality, smallest size, fastest inference |

## Citation

If you use this model, please cite the original model:

```bibtex
# See original model card for citation information
```

## License

This model inherits the license from the original model. Please refer to the [original model card](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2) for license details.

## Acknowledgements

- Original model by the authors of [mixedbread-ai/mxbai-rerank-large-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2)
- GGUF conversion via llama.cpp by ggml.ai
- Converted and uploaded by [sinjab](https://huggingface.co/sinjab)
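
## Example: Querying the Rerank Endpoint

Once the server from the Usage section is running, candidate documents can be scored against a query over HTTP. The sketch below assumes llama-server exposes a `/v1/rerank` endpoint on the default port 8080 and accepts `query`/`documents` fields, as described in llama.cpp's server documentation; the query and documents are hypothetical placeholders.

```bash
# Score two hypothetical documents against a query via the rerank endpoint.
# Endpoint path, port, and field names depend on your llama.cpp version.
curl http://localhost:8080/v1/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mxbai-rerank-large-v2-F16.gguf",
    "query": "What is the capital of France?",
    "documents": [
      "Paris is the capital of France.",
      "Berlin is the capital of Germany."
    ]
  }'
```

The response is expected to contain a relevance score per document (keyed by document index), which can be used to reorder the candidates before passing them to a downstream retrieval-augmented pipeline.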