Update README.md
vllm model card update
README.md
CHANGED
@@ -109,12 +109,12 @@ print("result:", result)
 
 ```bash
 # 80G * 16 GPU
-vllm serve baidu/ERNIE-4.5-300B-A47B-Base-PT --
+vllm serve baidu/ERNIE-4.5-300B-A47B-Base-PT --tensor-parallel-size 16
 ```
 
 ```bash
-# FP8 online quantification 80G *
-vllm serve baidu/ERNIE-4.5-300B-A47B-Base-PT --
+# FP8 online quantification 80G * 8 GPU
+vllm serve baidu/ERNIE-4.5-300B-A47B-Base-PT --tensor-parallel-size 8 --quantization fp8
 ```
 
 ## License
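
As a rough sanity check on the GPU counts in the two commands, the sketch below estimates weight memory only. It assumes about 300B total parameters (read off the model name), 2 bytes per parameter for 16-bit weights, and 1 byte per parameter for FP8. The helper name `weight_gb` is hypothetical, and the estimate ignores KV cache, activations, and runtime overhead, so treat it as a lower bound rather than a sizing rule.

```python
# Rough weight-memory estimate; ignores KV cache, activations, and runtime overhead.
TOTAL_PARAMS_B = 300  # assumption: ~300B total parameters, read off the model name
GPU_MEM_GB = 80       # "80G" per GPU, taken from the comments in the diff


def weight_gb(params_b: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes) for params_b billion parameters."""
    return params_b * bytes_per_param


# (label, assumed bytes per parameter, GPU count from the corresponding command)
for label, bytes_per_param, gpus in [("16-bit", 2, 16), ("fp8", 1, 8)]:
    need = weight_gb(TOTAL_PARAMS_B, bytes_per_param)
    have = GPU_MEM_GB * gpus
    print(f"{label}: weights ~{need:.0f} GB, {gpus} GPUs provide {have} GB")
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which is consistent with the FP8 command using half as many GPUs.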