DeepSeek OCR
Note: Currently, only NexaSDK supports this model's GGUF.
Quickstart
Install NexaSDK
Run the model locally with one line of code:
nexa infer NexaAI/DeepSeek-OCR-GGUF
Then drag your image into the terminal or type the image path.
Case 1: extract text
<your-image-path> Free OCR.
Case 2: extract bounding boxes
<your-image-path> <|grounding|>Convert the document to markdown.
Note: If the model fails to run on Windows, install the latest Vulkan driver.
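The two prompt forms above differ only in the instruction that follows the image path. As a minimal sketch, the line to type at the nexa infer prompt could be assembled as shown below. The helper name build_prompt, the mode keys, and the file-type check (based on the JPEG and PNG formats listed under Inputs) are assumptions made for illustration and are not part of NexaSDK.

```python
from pathlib import Path

# Instructions copied verbatim from the two Quickstart cases.
PROMPTS = {
    "text": "Free OCR.",
    "grounding": "<|grounding|>Convert the document to markdown.",
}

# Assumption: restrict to the image file types this card lists as inputs.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_prompt(image_path: str, mode: str = "text") -> str:
    """Return '<image-path> <instruction>' for the chosen mode (hypothetical helper)."""
    if mode not in PROMPTS:
        raise ValueError(f"unknown mode: {mode!r}")
    suffix = Path(image_path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported image type: {suffix!r}")
    return f"{image_path} {PROMPTS[mode]}"


if __name__ == "__main__":
    print(build_prompt("receipt.png"))
    print(build_prompt("invoice.jpg", mode="grounding"))
```

The returned string is only the text to enter in the interactive session; the sketch does not launch nexa infer itself.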
Model Description
DeepSeek OCR is a high-accuracy optical character recognition model built for extracting text from complex visual inputs such as documents, screenshots, receipts, and natural scenes.
It combines vision-language modeling with efficient visual encoders to achieve superior recognition of multi-language and multi-layout text while remaining lightweight enough for edge or on-device deployment.
Features
- Multilingual OCR: recognizes printed and handwritten text across major global languages.
- Document Layout Understanding: preserves structure such as tables, paragraphs, and titles.
- Scene Text Recognition: robust against lighting, distortion, and low-quality captures.
- Lightweight & Fast: optimized for CPU and GPU acceleration.
- End-to-End Pipeline: supports image-to-text and structured JSON output.
Use Cases
- Digitizing scanned documents or PDFs
- Extracting text from mobile camera inputs or screenshots
- Invoice and receipt parsing
- OCR-based search and indexing systems
- Visual question answering or document agents
Inputs and Outputs
Input:
- Image file (JPEG, PNG, or tensor array)
- Optional parameters for language hints or layout detection
Output:
- Extracted text (plain text or structured format with bounding boxes)
- Confidence scores per word or region
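To illustrate how downstream code might consume structured output with bounding boxes and confidence scores, here is a minimal sketch. The OcrRegion type, its field names, the (x_min, y_min, x_max, y_max) box layout, the 0 to 1 confidence range, and the sample values are all assumptions made for this example; the model's actual output schema is not specified here.

```python
from dataclasses import dataclass


@dataclass
class OcrRegion:
    """Hypothetical container for one recognized region."""
    text: str
    bbox: tuple[int, int, int, int]  # assumed layout: (x_min, y_min, x_max, y_max)
    confidence: float                # assumed range: 0.0 to 1.0


def confident_text(regions: list[OcrRegion], threshold: float = 0.5) -> str:
    """Drop low-confidence regions and join the rest in reading order."""
    kept = [r for r in regions if r.confidence >= threshold]
    # Sort top-to-bottom, then left-to-right, using the box's top-left corner.
    kept.sort(key=lambda r: (r.bbox[1], r.bbox[0]))
    return "\n".join(r.text for r in kept)


# Made-up sample data, for illustration only.
regions = [
    OcrRegion("Total: 42.00", (10, 80, 200, 100), 0.97),
    OcrRegion("ACME Store", (10, 10, 180, 40), 0.99),
    OcrRegion("smudge", (10, 50, 60, 70), 0.12),
]

if __name__ == "__main__":
    print(confident_text(regions))
```

Sorting by the top-left corner is a simple reading-order heuristic; multi-column layouts would need a more careful grouping step.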
Integration
DeepSeek OCR can be integrated through:
- Python API (pip install deepseek-ocr)
- REST or gRPC endpoints for server deployment
License
This model is released under the Apache 2.0 License, allowing commercial use, modification, and redistribution with attribution.
Model tree for NexaAI/DeepSeek-OCR-GGUF
Base model
deepseek-ai/DeepSeek-OCR