Update README.md
Browse files
README.md
CHANGED
|
@@ -82,10 +82,10 @@ if __name__ == '__main__':
|
|
| 82 |
#### Online Inference
|
| 83 |
```shell
|
| 84 |
vllm serve inclusionAI/Ring-mini-linear-2.0-GPTQ-int4 \
|
| 85 |
-
--tensor-parallel-size
|
| 86 |
--pipeline-parallel-size 1 \
|
| 87 |
--gpu-memory-utilization 0.90 \
|
| 88 |
-
--max-num-seqs
|
| 89 |
--no-enable-prefix-caching
|
| 90 |
```
|
| 91 |
|
|
|
|
| 82 |
#### Online Inference
|
| 83 |
```shell
|
| 84 |
vllm serve inclusionAI/Ring-mini-linear-2.0-GPTQ-int4 \
|
| 85 |
+
--tensor-parallel-size 1 \
|
| 86 |
--pipeline-parallel-size 1 \
|
| 87 |
--gpu-memory-utilization 0.90 \
|
| 88 |
+
--max-num-seqs 128 \
|
| 89 |
--no-enable-prefix-caching
|
| 90 |
```
|
| 91 |
|