zheyishine commited on
Commit
5e5be73
·
verified ·
1 Parent(s): cb9a678

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -2
README.md CHANGED
@@ -82,10 +82,10 @@ if __name__ == '__main__':
82
  #### Online Inference
83
  ```shell
84
  vllm serve inclusionAI/Ring-mini-linear-2.0-GPTQ-int4 \
85
- --tensor-parallel-size 2 \
86
  --pipeline-parallel-size 1 \
87
  --gpu-memory-utilization 0.90 \
88
- --max-num-seqs 512 \
89
  --no-enable-prefix-caching
90
  ```
91
 
 
82
  #### Online Inference
83
  ```shell
84
  vllm serve inclusionAI/Ring-mini-linear-2.0-GPTQ-int4 \
85
+ --tensor-parallel-size 1 \
86
  --pipeline-parallel-size 1 \
87
  --gpu-memory-utilization 0.90 \
88
+ --max-num-seqs 128 \
89
  --no-enable-prefix-caching
90
  ```
91