xiaohei66 committed
Commit 5752c20 · verified · 1 Parent(s): 4a039c8

PaddleOCR-VL-0.9B is now officially supported on vLLM

Files changed (1): README.md (+23 -11)
README.md CHANGED
@@ -72,9 +72,10 @@ PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vi
 
 
 ## News
- * ```2025.10.16``` 🚀 We release [PaddleOCR-VL](https://github.com/PaddlePaddle/PaddleOCR), a multilingual document parsing solution built on a 0.9B ultra-compact Vision-Language Model with SOTA performance.
- * ```2025.10.29``` PaddleOCR-VL-0.9B, the core module of PaddleOCR-VL, can now be called via the `transformers` library.
+ * ```2025.11.04``` 🥳 PaddleOCR-VL-0.9B is now officially supported on `vLLM`.
+ * ```2025.10.29``` PaddleOCR-VL-0.9B, the core module of PaddleOCR-VL, can now be called via the `transformers` library.
+ * ```2025.10.16``` 🚀 We release [PaddleOCR-VL](https://github.com/PaddlePaddle/PaddleOCR), a multilingual document parsing solution built on a 0.9B ultra-compact Vision-Language Model with SOTA performance.
 
 
 ## Usage
 
@@ -113,15 +114,25 @@ for res in output:
 
 ### Accelerate VLM Inference via Optimized Inference Servers
 
- 1. Start the VLM inference server (the default port is `8080`):
-
-    ```bash
-    docker run \
-        --rm \
-        --gpus all \
-        --network host \
-        ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddlex-genai-vllm-server
-    ```
+ 1. Start the VLM inference server:
+
+    You can start the vLLM inference service using one of two methods:
+
+    - Method 1: the PaddleOCR method
+
+      ```bash
+      docker run \
+          --rm \
+          --gpus all \
+          --network host \
+          ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-genai-vllm-server:latest \
+          paddleocr genai_server --model_name PaddleOCR-VL-0.9B --host 0.0.0.0 --port 8080 --backend vllm
+      ```
+
+    - Method 2: the vLLM method
+
+      [vLLM: PaddleOCR-VL Usage Guide](https://docs.vllm.ai/projects/recipes/en/latest/PaddlePaddle/PaddleOCR-VL.html)
+
 2. Call the PaddleOCR CLI or Python API:
 
    ```bash
@@ -130,6 +141,7 @@ for res in output:
        --vl_rec_backend vllm-server \
        --vl_rec_server_url http://127.0.0.1:8080/v1
    ```
+
    ```python
    from paddleocr import PaddleOCRVL
    pipeline = PaddleOCRVL(vl_rec_backend="vllm-server", vl_rec_server_url="http://127.0.0.1:8080/v1")
@@ -346,4 +358,4 @@ If you find PaddleOCR-VL helpful, feel free to give us a star and citation.
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2510.14528},
 }
- ```
+ ```
 
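For Method 2 above, the linked vLLM recipe describes serving PaddleOCR-VL with vLLM directly. The following is a minimal sketch only, not the official recipe: the model ID `PaddlePaddle/PaddleOCR-VL`, the port, and the absence of extra flags are assumptions here, so follow the linked guide for the recommended command and vLLM version.

```bash
# Sketch: bring up vLLM's OpenAI-compatible server for PaddleOCR-VL.
# Model ID and port are assumptions; see the PaddleOCR-VL usage guide
# linked above for the officially recommended flags.
pip install -U vllm
vllm serve PaddlePaddle/PaddleOCR-VL --port 8080

# Sanity check: list the models exposed by the OpenAI-compatible API.
# This base URL is what --vl_rec_server_url / vl_rec_server_url expect.
curl http://127.0.0.1:8080/v1/models
```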
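The `PaddleOCRVL(...)` snippet in the diff only constructs the pipeline against the running server. A short end-to-end sketch follows; it assumes the `predict()` and `save_to_*()` result interface used earlier in the README (the `for res in output:` loop visible in the hunk context), and the input image and output directory are placeholder paths.

```python
# Sketch: parse one document through the vLLM-served backend.
# Assumes the predict()/save_to_*() result API shown earlier in the README;
# "sample_doc.png" and "output" are placeholder paths.
from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL(
    vl_rec_backend="vllm-server",
    vl_rec_server_url="http://127.0.0.1:8080/v1",
)
output = pipeline.predict("sample_doc.png")
for res in output:
    res.print()                               # print the parsed result
    res.save_to_json(save_path="output")      # structured layout + text
    res.save_to_markdown(save_path="output")  # document reconstructed as Markdown
```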
 