PaddleOCR-VL-0.9B is now officially supported on vLLM
README.md CHANGED

@@ -72,9 +72,10 @@ PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vi
 
 
 ## News
-* ```2025.10.16``` 🚀 We release [PaddleOCR-VL](https://github.com/PaddlePaddle/PaddleOCR), a multilingual document-parsing solution built on a 0.9B ultra-compact vision-language model with SOTA performance.
-* ```2025.10.29``` PaddleOCR-VL-0.9B, the core module of PaddleOCR-VL, can now be called via the `transformers` library.
 
+* ```2025.11.04``` 🥳 PaddleOCR-VL-0.9B is now officially supported on `vLLM`.
+* ```2025.10.29``` PaddleOCR-VL-0.9B, the core module of PaddleOCR-VL, can now be called via the `transformers` library.
+* ```2025.10.16``` 🚀 We release [PaddleOCR-VL](https://github.com/PaddlePaddle/PaddleOCR), a multilingual document-parsing solution built on a 0.9B ultra-compact vision-language model with SOTA performance.
 
 ## Usage
 
@@ -113,15 +114,25 @@ for res in output:
 
 ### Accelerate VLM Inference via Optimized Inference Servers
 
-1. Start the VLM inference server
+1. Start the VLM inference server:
+
+   You can start the vLLM inference server in one of two ways:
+
+   - Method 1: via PaddleOCR
+
+     ```bash
+     docker run \
+         --rm \
+         --gpus all \
+         --network host \
+         ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-genai-vllm-server:latest \
+         paddleocr genai_server --model_name PaddleOCR-VL-0.9B --host 0.0.0.0 --port 8080 --backend vllm
+     ```
+
+   - Method 2: via vLLM
+
+     See the [vLLM: PaddleOCR-VL Usage Guide](https://docs.vllm.ai/projects/recipes/en/latest/PaddlePaddle/PaddleOCR-VL.html).
+
 2. Call the PaddleOCR CLI or Python API:
 
 ```bash

@@ -130,6 +141,7 @@ for res in output:
     --vl_rec_backend vllm-server \
     --vl_rec_server_url http://127.0.0.1:8080/v1
 ```
+
 ```python
 from paddleocr import PaddleOCRVL
 pipeline = PaddleOCRVL(vl_rec_backend="vllm-server", vl_rec_server_url="http://127.0.0.1:8080/v1")

@@ -346,4 +358,4 @@ If you find PaddleOCR-VL helpful, feel free to give us a star and citation.
   primaryClass={cs.CV},
   url={https://arxiv.org/abs/2510.14528},
 }
-```
+```
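
Before wiring the server into the pipeline, it can help to confirm it is reachable. Below is a minimal Python sketch, assuming the server exposes the OpenAI-compatible `/v1/models` route; that route is standard for vLLM-based servers but is not shown in this diff.

```python
# Sanity check: confirm the inference server started above is reachable.
# Assumption: the server exposes the OpenAI-compatible /v1/models route,
# which vLLM-based servers normally provide (not shown in this diff).
import json
import urllib.request

SERVER_URL = "http://127.0.0.1:8080/v1"  # same URL passed to --vl_rec_server_url

with urllib.request.urlopen(f"{SERVER_URL}/models", timeout=10) as resp:
    models = json.load(resp)

# PaddleOCR-VL-0.9B (the --model_name used when starting the server)
# should appear among the served models.
print([m.get("id") for m in models.get("data", [])])
```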
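With the server up, the client side reduces to the two lines added in the diff plus the prediction loop already in the README (the `for res in output:` hunk context). Below is a minimal end-to-end sketch: the input path is a placeholder, and the `predict`, `save_to_json`, and `save_to_markdown` calls are assumptions modeled on that loop rather than signatures confirmed by this diff.

```python
# End-to-end sketch: offload VL recognition to the vLLM server and parse a
# document. The PaddleOCRVL(...) arguments are taken verbatim from the diff;
# predict() and the save_to_* helpers are assumptions modeled on the
# `for res in output:` loop visible in the surrounding README context.
from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL(
    vl_rec_backend="vllm-server",
    vl_rec_server_url="http://127.0.0.1:8080/v1",
)

output = pipeline.predict("path/to/document.png")  # placeholder input path
for res in output:
    res.save_to_json(save_path="output")      # assumed result-saving helper
    res.save_to_markdown(save_path="output")  # assumed result-saving helper
```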