Update README.md
README.md CHANGED
@@ -1,4 +1,8 @@
 ---
 base_model:
 - huihui-ai/DeepSeek-R1-Distill-Llama-70B-abliterated
----
+---
+
+Needed to run a 4-bit quantization on vLLM but only GGUFs were available.
+
+Loading time went from ~9 minutes to 2.5 minutes. Throughput went from 25 tokens/second to 45 tokens/second.
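
For a rough sense of scale, the figures quoted in the added README text can be turned into ratios. This is only a minimal sketch of the arithmetic: the variable names are hypothetical, the numbers are copied from the README lines (the "~9 minutes" value is approximate in the source), and nothing here is a new measurement.

```python
# Figures copied from the README text; "~9 minutes" is approximate in the source.
load_before_min = 9.0    # loading time before, in minutes
load_after_min = 2.5     # loading time after, in minutes
tps_before = 25.0        # throughput before, in tokens/second
tps_after = 45.0         # throughput after, in tokens/second

# Ratio of old to new loading time (higher means faster loading).
load_speedup = load_before_min / load_after_min

# Ratio of new to old throughput (higher means more tokens per second).
throughput_gain = tps_after / tps_before

print(f"Loading: about {load_speedup:.1f}x faster")
print(f"Throughput: about {throughput_gain:.1f}x higher")
```

Under these numbers, loading is roughly 3.6 times faster and throughput roughly 1.8 times higher.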