Update README.md
@@ -147,6 +147,13 @@ This model is most easily served with [OpenChat's](https://github.com/imoneoi/op
This is highly recommended, as it is by far the fastest option for inference and is quick and easy to set up.

We also illustrate setup of Oobabooga/text-generation-webui below. The settings outlined there also apply to other uses of `Transformers`.
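As a minimal sketch of the `Transformers` route mentioned above: the repo id is assumed from this model card, and the actual load is kept behind a main guard because it downloads roughly 26 GB of fp16 weights.

```python
# Sketch: loading the full-precision model with Hugging Face Transformers.
# MODEL_ID is an assumption based on this model card; adjust if needed.
MODEL_ID = "Open-Orca/OpenOrcaxOpenChat-Preview2-13B"


def load_model():
    # Heavy: downloads the full fp16 checkpoint on first call.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",  # spread layers across available GPUs/CPU
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
```

The same `from_pretrained` settings carry over to text-generation-webui's Transformers loader, which is the point made above.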
## Serving Quantized

Pre-quantized models are now available courtesy of our friend TheBloke:

* **GGML**: https://huggingface.co/TheBloke/OpenOrcaxOpenChat-Preview2-13B-GGML
* **GPTQ**: https://huggingface.co/TheBloke/OpenOrcaxOpenChat-Preview2-13B-GPTQ
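One way to use the GGML build is via `llama-cpp-python`. The sketch below, under the assumption that you pick a quantization file from the GGML repo's file list yourself, fetches it with `huggingface_hub` and loads it; the download is kept behind a main guard since the files are several GB.

```python
# Sketch: fetching and loading a GGML quantization with llama-cpp-python.
# Repo ids are taken from the links above; the exact filename must be
# chosen from the repo's file list on Hugging Face.
GGML_REPO = "TheBloke/OpenOrcaxOpenChat-Preview2-13B-GGML"
GPTQ_REPO = "TheBloke/OpenOrcaxOpenChat-Preview2-13B-GPTQ"


def load_ggml(filename: str):
    # Heavy: downloads a multi-GB file and requires llama-cpp-python.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    path = hf_hub_download(repo_id=GGML_REPO, filename=filename)
    return Llama(model_path=path, n_ctx=2048)


if __name__ == "__main__":
    # Pass a quantization file listed in the GGML repo as the first argument.
    import sys

    llm = load_ggml(sys.argv[1])
    out = llm("Hello", max_tokens=16)
    print(out["choices"][0]["text"])
```

The GPTQ build is aimed at GPU inference instead and loads through GPTQ-aware tooling rather than llama.cpp; see that repo's card for its loader instructions.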
## Serving with OpenChat