Commit a94f6a8
Parent(s): 514e85c
Update README.md

README.md CHANGED
@@ -10,4 +10,24 @@ tags:
---

This is an fp16 copy of [jarradh/llama2_70b_chat_uncensored](https://huggingface.co/jarradh/llama2_70b_chat_uncensored) for faster downloading and lower disk usage than the fp32 original. I simply loaded the model on CPU with `torch_dtype=torch.float16` and then exported it again. All credit for the model goes to [jarradh](https://huggingface.co/jarradh).
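For illustration, here is a standard-library sketch of why the half-precision copy is worthwhile: each fp32 weight (4 bytes) is re-stored as fp16 (2 bytes), halving checkpoint size at a small cost in precision. (The actual conversion presumably used transformers' `from_pretrained(..., torch_dtype=torch.float16)` followed by `save_pretrained` on the full 70B checkpoint; that call pattern is my assumption, the README only states the `torch_dtype` used.)

```python
import struct

# Pack the same values as fp32 ('f') and as IEEE half precision ('e').
weights = [0.1, -2.5, 3.75]
packed_fp32 = struct.pack(f"{len(weights)}f", *weights)
packed_fp16 = struct.pack(f"{len(weights)}e", *weights)

# fp16 takes exactly half the bytes of fp32.
assert len(packed_fp16) * 2 == len(packed_fp32)

# Values round-trip with reduced precision: -2.5 and 3.75 are exactly
# representable in fp16, while 0.1 is only approximate.
restored = struct.unpack(f"{len(weights)}e", packed_fp16)
assert restored[1] == -2.5 and restored[2] == 3.75
assert abs(restored[0] - 0.1) < 1e-3
```

The same byte-halving applies per weight across the whole checkpoint, which is why the fp16 copy is roughly half the download of the fp32 original.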

Arguably a better name for this model would be something like Llama-2-70B_Wizard-Vicuna-Uncensored-fp16, but to avoid confusion I'm sticking with jarradh's naming scheme.

<!-- repositories-available start -->
## Repositories available

* [GPTQ models for GPU inference, with multiple quantisation parameter options](https://huggingface.co/TheBloke/llama2_70b_chat_uncensored-GPTQ)
* [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/llama2_70b_chat_uncensored-GGML)
* [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference, plus fp16 GGUF for requantizing](https://huggingface.co/YokaiKoibito/WizardLM-Uncensored-Falcon-40B-GGUF)
* [Jarrad Hope's unquantised model in fp16 pytorch format, for GPU inference and further conversions](https://huggingface.co/YokaiKoibito/llama2_70b_chat_uncensored-fp16)
* [Jarrad Hope's original unquantised fp32 model in pytorch format, for further conversions](https://huggingface.co/jarradh/llama2_70b_chat_uncensored)

<!-- repositories-available end -->

## Prompt template: Human-Response

```
### HUMAN:
{prompt}

### RESPONSE:
```
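A minimal sketch of filling this template before sending text to the model (plain Python, no external libraries; `build_prompt` is a hypothetical helper name, not part of the model's tooling):

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the Human-Response template above.

    Ends right after '### RESPONSE:' so the model's generation
    continues as the response.
    """
    return f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n"

# Example usage:
prompt = build_prompt("What is fp16?")
print(prompt)
```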