secretmoon committed · Commit 57fa47a · verified · 1 Parent(s): 246b5f9

Upload README.md

Files changed (1)
  1. README.md +106 -1
README.md CHANGED
@@ -1,6 +1,111 @@
  ---
  license: cc-by-nc-4.0
  language:
  - en
  pipeline_tag: text-generation
- ---
  ---
  license: cc-by-nc-4.0
+ library_name: peft
+ base_model: Sao10K/L3-8B-Stheno-v3.1
  language:
  - en
  pipeline_tag: text-generation
+ ---
+
+ ## Overview
+
+ **secretmoon/LoRA-Llama-3-MLP** is a LoRA adapter for the Llama-3-8B model, designed primarily to expand the model's knowledge of the MLP:FiM (My Little Pony: Friendship is Magic) universe. It is intended for generating fan fiction, role-playing scenarios, and other creative projects. The training data includes factual content from the Fandom wiki and canonical fan works that explore the universe in depth.
+
+ ![Night alicorn](https://huggingface.co/secretmoon/LoRA-Llama-3-MLP/resolve/main/profile.png)
+
+ ## Base Model
+
+ The base model for this adapter is **Sao10K/L3-8B-Stheno-v3.1**, an excellent fine-tuned version of the original Llama-3-8B. It excels at story writing and role-playing and does not suffer from the degradation that overfitting can cause.
+
+ ## Training Details
+
+ - **Dataset:**
+   1. Cleaned copy of the MLP Fandom Wiki, excluding information about recent and side projects unrelated to MLP:FiM. (Alpaca format)
+   2. Approximately 100 specially selected fan stories from FiMFiction. (Raw text)
+   3. Additional data to train the model as a personal assistant and enhance its sensitivity to user emotions. (Alpaca format)
+ - **Training Duration:** 3 hours
+ - **Hardware:** 1 x NVIDIA RTX A6000 48GB
+ - **PEFT Type:** LoRA 8-bit
+ - **Sequence Length:** 6144
+ - **Batch Size:** 2
+ - **Num Epochs:** 3
+ - **Optimizer:** AdamW_BNB_8bit
+ - **Learning Rate Scheduler:** Cosine
+ - **Learning Rate:** 0.00033
+ - **LoRA R:** 256
+ - **Sample Packing:** True
+ - **LoRA Target Linear:** True
+
+ ## How to Use
+
+ You can apply the adapter to the original Safetensors weights of the base model and load it through Transformers, or you can merge the adapter into the base model weights and convert the result to an f16 .gguf file for use in llama.cpp.
+
+ ### Recommendations for LoRA Alpha
+
+ - **16:** Low influence
+ - **48:** Suggested optimal value (recommended)
+ - **64:** High influence, significantly impacting model behavior
+ - **128:** Very high influence, drastically changing language model behavior (not recommended)
+
+ You can modify this parameter in the `adapter_config.json` file. For example, I merged the adapter with the base model using LoRA alpha=40.
+
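For context on why alpha controls the adapter's influence: in standard LoRA the low-rank update is multiplied by `lora_alpha / r` before it is added to the base weights. With the rank of 256 listed under Training Details, the alpha values above therefore correspond to scaling factors between roughly 0.06 and 0.5. The following is a minimal sketch rather than the author's tooling: the helper names `lora_scaling` and `set_lora_alpha` are hypothetical, and it assumes the adapter uses the default scaling (rank-stabilized LoRA would divide by the square root of `r` instead).

```python
import json
from pathlib import Path


def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / r


def set_lora_alpha(config_path: str, alpha: int) -> float:
    """Rewrite `lora_alpha` in a PEFT adapter_config.json.

    Returns the scaling factor that results from the new alpha and the
    rank `r` already stored in the file.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["lora_alpha"] = alpha
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return lora_scaling(alpha, config["r"])


# Scaling factors for the alpha values listed above, assuming r = 256.
for alpha in (16, 48, 64, 128):
    print(f"alpha={alpha}: scaling={lora_scaling(alpha, 256)}")
```

Run `set_lora_alpha` on a local copy of the adapter before loading or merging it; PEFT reads `lora_alpha` from `adapter_config.json` when the adapter is loaded.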
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel
+
+ # Load the tokenizer of the base model
+ tokenizer = AutoTokenizer.from_pretrained("Sao10K/L3-8B-Stheno-v3.1")
+
+ # Load the base model in fp16 (requires at least ~15 GB of VRAM)
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "Sao10K/L3-8B-Stheno-v3.1",
+     trust_remote_code=True,
+     device_map="auto",
+     torch_dtype=torch.float16,  # omit to load in full precision if you have enough VRAM
+ )
+
+ # Apply the LoRA adapter on top of the base model
+ adapter_name = "secretmoon/LoRA-Llama-3-MLP"
+ model = PeftModel.from_pretrained(base_model, adapter_name)
+ model = model.eval()
+
+ # Text generation function
+ def generate_text(prompt, max_length=100, num_return_sequences=1):
+     # Tokenize the prompt and move the tensors to the model's device
+     inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+     outputs = model.generate(
+         **inputs,
+         max_length=max_length,  # total length, prompt tokens included
+         num_return_sequences=num_return_sequences,
+         no_repeat_ngram_size=2,
+     )
+     return [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]
+
+ prompt = "Once upon a time"
+ generated_texts = generate_text(prompt)
+ for i, text in enumerate(generated_texts):
+     print(f"Generated text {i+1}:\n{text}\n")
+ ```
+ Example output:
+ ```plaintext
+ Generated text 1:
+ Once upon a time, there was a young filly named Luna. She was the younger sister of a powerful princess named Celestia. Luna lived in a beautiful castle with her sister and their parents, the king and queen. The castle was surrounded by a lush, green forest, and it was always filled with the sounds of birds singing and animals playing.
+ ```
+
+ ## Merge
+
+ 1. **Using Axolotl** (https://github.com/OpenAccess-AI-Collective/axolotl)
+ ```bash
+ python3 -m axolotl.cli.merge_lora lora.yml --lora_model_dir="./completed-model"
+ ```
+
+ 2. **Converting the adapter for use with gguf models in older llama.cpp versions**
+ ```bash
+ python3 convert-lora-to-ggml.py /path/to/lora/adapter
+ ```
+
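If Axolotl is not available, the merge can in principle also be done with PEFT directly. The sketch below is not the author's documented workflow: the function name `merge_adapter` and the output directory are hypothetical, and it assumes enough memory to hold the fp16 base model. The merge uses whatever `lora_alpha` is set in the adapter's `adapter_config.json` at load time.

```python
def merge_adapter(base_id: str, adapter_id: str, out_dir: str) -> str:
    """Fold a LoRA adapter into its base model and save the merged weights."""
    # Heavy dependencies are imported lazily so that defining this helper
    # does not require torch, transformers, or peft to be loaded yet.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model in fp16, then attach the adapter on top of it.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_id)

    # merge_and_unload() adds the scaled low-rank update to the base weights
    # and returns a plain Transformers model without the PEFT wrappers.
    merged = model.merge_and_unload()

    # Save the merged weights as Safetensors, plus the tokenizer files.
    merged.save_pretrained(out_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_adapter("Sao10K/L3-8B-Stheno-v3.1", "secretmoon/LoRA-Llama-3-MLP", "./merged-model")` would write a standalone model directory, which can then be passed to llama.cpp's Hugging Face conversion script to produce the f16 .gguf file mentioned under How to Use.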
+ ## Other
+ <br> You can contact me on Telegram (@monstor86) or Discord (@starlight2288).
+ <br> You can also try role-playing with this adapter for free through my Telegram bot, @Luna_Pony_bot.
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)