Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish, …

</details>

## Model Download and Inference

We take Apollo-MoE-0.5B as an example.

1. Log in to Hugging Face

```
huggingface-cli login --token $HUGGINGFACE_TOKEN
```
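For scripted use it can help to fail fast when the token is missing rather than hitting an authorization error mid-download. A minimal sketch (the `get_hf_token` helper is our own; it assumes the same `HUGGINGFACE_TOKEN` environment variable used by the CLI command above):

```python
import os

def get_hf_token():
    # Read the same token the CLI login uses; raising early gives a clearer
    # error than a 401 in the middle of a model download.
    token = os.environ.get("HUGGINGFACE_TOKEN")
    if not token:
        raise RuntimeError("Set HUGGINGFACE_TOKEN before logging in to Hugging Face")
    return token
```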

2. Download the model to a local directory

```
from huggingface_hub import snapshot_download
import os

local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')
snapshot_download(repo_id="FreedomIntelligence/Apollo-MoE-0.5B", local_dir=local_model_dir)
```
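`snapshot_download` mirrors the repo into `local_dir`, so a quick sanity check that the snapshot contains a config and a weights file can save a confusing failure at load time. A sketch under stated assumptions (the helper name and the exact filenames are ours; Transformers checkpoints conventionally ship `config.json` plus `.safetensors` or `.bin` weights):

```python
import os

def has_model_files(model_dir):
    # Conventionally a Transformers checkpoint has a config.json plus weight
    # files ending in .safetensors or .bin; adjust the patterns as needed.
    if not os.path.isdir(model_dir):
        return False
    names = os.listdir(model_dir)
    return "config.json" in names and any(
        n.endswith((".safetensors", ".bin")) for n in names
    )
```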

3. Inference example

```
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import os

local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')

model = AutoModelForCausalLM.from_pretrained(local_model_dir, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(local_model_dir, trust_remote_code=True)
generation_config = GenerationConfig.from_pretrained(
    local_model_dir,
    pad_token_id=tokenizer.pad_token_id,
    num_return_sequences=1,
    max_new_tokens=7,
    min_new_tokens=2,
    do_sample=False,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
)

inputs = tokenizer('Answer directly.\nThe capital of Mongolia is Ulaanbaatar.\nThe capital of Iceland is Reykjavik.\nThe capital of Australia is', return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
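Note that `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string above includes the few-shot prompt. Slicing by the prompt length keeps only the continuation; a stand-alone sketch of the indexing (pure Python, no model required, and the helper name is our own):

```python
def continuation_ids(pred_ids, prompt_len):
    # generate() output = prompt ids + new ids; drop the first prompt_len
    # entries to decode only the model's continuation.
    return pred_ids[prompt_len:]

# With the real model, the equivalent slice would be:
#   new_ids = pred[0][inputs['input_ids'].shape[1]:]
```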
## Results reproduction
<details><summary>Click to expand</summary>