DeepSeek OCR

Note: currently, only NexaSDK supports the GGUF build of this model.

Quickstart

Install NexaSDK
Run the model locally with a single command:
```
nexa infer NexaAI/DeepSeek-OCR-GGUF
```
Then drag your image into the terminal, or type the image path, followed by a prompt.

Case 1: extract text

<your-image-path> Free OCR.

Case 2: extract bounding boxes

<your-image-path> <|grounding|>Convert the document to markdown.

Note: if the model fails to run on Windows, install the latest Vulkan driver.
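
The two prompt formats above can also be composed programmatically, for example when preparing input lines for a batch of images. The following is a minimal sketch in Python: `TASK_PROMPTS` and `build_prompt` are hypothetical names introduced for illustration and are not part of NexaSDK. Only the two prompt strings come from the cases above.

```python
# Prompt strings copied from Case 1 and Case 2 above.
# The dictionary keys ("text", "grounding") are hypothetical labels.
TASK_PROMPTS = {
    "text": "Free OCR.",
    "grounding": "<|grounding|>Convert the document to markdown.",
}


def build_prompt(image_path: str, task: str = "text") -> str:
    """Return the '<your-image-path> <prompt>' line for the chosen task."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASK_PROMPTS)}")
    return f"{image_path} {TASK_PROMPTS[task]}"


if __name__ == "__main__":
    # Print one input line per task for a placeholder image path.
    for task in TASK_PROMPTS:
        print(build_prompt("receipt.png", task))
```

Each printed line has the same shape as the examples in Case 1 and Case 2 and can be pasted into the interactive session started by `nexa infer`.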

Model Description

DeepSeek OCR is a high-accuracy optical character recognition model built for extracting text from complex visual inputs such as documents, screenshots, receipts, and natural scenes.
It combines vision-language modeling with efficient visual encoders to recognize multi-language and multi-layout text accurately while remaining lightweight enough for edge or on-device deployment.

Features

Multilingual OCR — recognizes printed and handwritten text across major global languages.
Document Layout Understanding — preserves structure such as tables, paragraphs, and titles.
Scene Text Recognition — robust against lighting, distortion, and low-quality captures.
Lightweight & Fast — optimized for CPU and GPU acceleration.
End-to-End Pipeline — supports image-to-text and structured JSON output.

Use Cases

Digitizing scanned documents or PDFs
Extracting text from mobile camera inputs or screenshots
Invoice and receipt parsing
OCR-based search and indexing systems
Visual question answering or document agents

Inputs and Outputs

Input:

Image (JPEG or PNG file, or a tensor array)
Optional parameters for language hints or layout detection

Output:

Extracted text (plain text or structured format with bounding boxes)
Confidence scores per word or region
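
The output described above (text regions with bounding boxes and confidence scores) could be represented and filtered along the following lines. This is only a sketch: the exact output schema is not specified here, so the `OcrRegion` class, its field names, the bounding-box convention, and the 0.0 to 1.0 confidence range are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OcrRegion:
    """Hypothetical container for one recognized word or region."""
    text: str
    bbox: Tuple[int, int, int, int]  # assumed (x_min, y_min, x_max, y_max) in pixels
    confidence: float                # assumed to lie between 0.0 and 1.0


def confident_text(regions: List[OcrRegion], min_confidence: float = 0.5) -> str:
    """Join the text of all regions whose confidence meets the threshold."""
    kept = [r.text for r in regions if r.confidence >= min_confidence]
    return "\n".join(kept)


if __name__ == "__main__":
    # Made-up sample values, not real model output.
    sample = [
        OcrRegion("Total: 12.50", (40, 300, 220, 330), 0.97),
        OcrRegion("~~~", (10, 340, 60, 350), 0.21),
    ]
    print(confident_text(sample))
```

Filtering on confidence before indexing or parsing is one simple way to keep low-quality regions out of downstream steps; the threshold of 0.5 is an arbitrary default.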

Integration

DeepSeek OCR can be integrated through:

Python API (pip install deepseek-ocr)
REST or gRPC endpoints for server deployment

License

This model is released under the Apache 2.0 License, allowing commercial use, modification, and redistribution with attribution.


GGUF

Model size: 3B params
Architecture: deepseek_vl_v2
Quantizations (hardware compatibility): 4-bit, 5-bit, 6-bit, 8-bit, 16-bit


Model tree for NexaAI/DeepSeek-OCR-GGUF

Base model: deepseek-ai/DeepSeek-OCR
Quantized (6): this model