Update README.md
README.md CHANGED
@@ -58,7 +58,10 @@ What's your evaluation based on the above unsafe content guidelines?
## Training and evaluation data

+ The finetuning consists of three steps:
+ 1. Apply LLaMA-2-70B-chat to generate responses to the harmless dataset from Anthropic.
+ 2. Apply LLaMA-2-70B-chat and ChatGPT 3.5 to evaluate the (question, answer) pairs generated in Step 1, producing the finetuning dataset.
+ 3. Use the evaluation dataset from Step 2 to finetune the LLaMA-2-7B-chat model with int8 quantization and Low-Rank Adaptation (LoRA).
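
As a rough illustration of Steps 1 and 2, the sketch below generates a response for each harmless prompt with LLaMA-2-70B-chat and pairs it with a judging prompt for the evaluator model. It assumes the Hugging Face `transformers` pipeline API; the checkpoint name, prompt handling, and judge template are illustrative assumptions, not details taken from this card.

```python
# Rough sketch of Steps 1-2: generate answers with LLaMA-2-70B-chat, then
# prepare each (question, answer) pair for evaluation by a judge model
# (LLaMA-2-70B-chat or ChatGPT 3.5). Prompt handling is assumed.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-70b-chat-hf",  # assumed checkpoint name
    device_map="auto",
)

# Step 1: harmless prompts (assumed to be pre-extracted from Anthropic's
# harmless data into plain questions).
harmless_prompts = ["How can I politely decline an invitation?"]  # placeholder

pairs = []
for question in harmless_prompts:
    output = generator(question, max_new_tokens=256, do_sample=True)
    answer = output[0]["generated_text"][len(question):].strip()
    pairs.append({"question": question, "answer": answer})

# Step 2: build a judging prompt for each pair, reusing the evaluation
# question that appears earlier in this card.
JUDGE_TEMPLATE = (
    "Question: {question}\nAnswer: {answer}\n"
    "What's your evaluation based on the above unsafe content guidelines?"
)
judge_inputs = [JUDGE_TEMPLATE.format(**pair) for pair in pairs]
```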
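
Step 3 can be sketched with the Hugging Face `peft` and `bitsandbytes` integrations as shown below. This is a minimal sketch, not the card's actual configuration: the checkpoint name, LoRA rank, and target modules are assumptions.

```python
# Hypothetical sketch of Step 3: LoRA finetuning of LLaMA-2-7B-chat loaded
# in int8. Hyperparameters and module names are assumed, not from the card.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-7b-chat-hf"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(base_model)  # for tokenizing pairs
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # int8 weights
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # cast norms, enable grad flow

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension (assumed)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # attention projections (assumed)
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```

The wrapped model can then be trained on the (question, answer) evaluation pairs from Step 2 with a standard causal-LM training loop, e.g. `transformers.Trainer`.
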
## Training procedure