---
language: en
license: apache-2.0
library_name: transformers
tags:
- llama
- vicuna
- eagle
- text-generation
pipeline_tag: text-generation
datasets:
- Aeala/ShareGPT_Vicuna_unfiltered
base_model:
- lmsys/vicuna-13b-v1.3
---
# Eagle-Vicuna-13B-v1.3

This is an EAGLE draft model trained for Vicuna-13B-v1.3, enabling faster inference through speculative decoding.

## Model Details

- **Base model**: [lmsys/vicuna-13b-v1.3](https://huggingface.co/lmsys/vicuna-13b-v1.3)
- **Method**: EAGLE (speculative decoding with a lightweight single-layer draft model)
- **Training data**: [Aeala/ShareGPT_Vicuna_unfiltered](https://huggingface.co/datasets/Aeala/ShareGPT_Vicuna_unfiltered)
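
## Usage

The draft model is meant to be loaded together with the base model through the official [EAGLE](https://github.com/SafeAILab/EAGLE) library. A minimal sketch based on the EAGLE repository's usage example (exact argument names may vary across library versions):

```python
import torch
from eagle.model.ea_model import EaModel
from fastchat.model import get_conversation_template

model = EaModel.from_pretrained(
    base_model_path="lmsys/vicuna-13b-v1.3",
    ea_model_path="Gavin1104/eagle-vicuna-13b-v1.3",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto",
)
model.eval()

# Build a Vicuna-style prompt and generate with EAGLE speculative decoding
conv = get_conversation_template("vicuna")
conv.append_message(conv.roles[0], "Hello, who are you?")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()

input_ids = torch.as_tensor(model.tokenizer([prompt]).input_ids).cuda()
output_ids = model.eagenerate(input_ids, temperature=0.5, max_new_tokens=512)
print(model.tokenizer.decode(output_ids[0]))
```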



## Model Configuration

Base model: `lmsys/vicuna-13b-v1.3`

EAGLE draft model architecture:

```text
Model(
  (embed_tokens): Embedding(32000, 5120, padding_idx=0)
  (layers): ModuleList(
    (0): LlamaDecoderLayer(
      (self_attn): LlamaAttention(
        (q_proj): Linear(in_features=5120, out_features=5120, bias=False)
        (k_proj): Linear(in_features=5120, out_features=5120, bias=False)
        (v_proj): Linear(in_features=5120, out_features=5120, bias=False)
        (o_proj): Linear(in_features=5120, out_features=5120, bias=False)
        (rotary_emb): LlamaRotaryEmbedding()
      )
      (mlp): LlamaMLP(
        (gate_proj): Linear(in_features=5120, out_features=13824, bias=False)
        (up_proj): Linear(in_features=5120, out_features=13824, bias=False)
        (down_proj): Linear(in_features=13824, out_features=5120, bias=False)
        (act_fn): SiLU()
      )
      (post_attention_layernorm): LlamaRMSNorm()
    )
  )
  (fc): Linear(in_features=10240, out_features=5120, bias=True)
  (act): SiLU()
)
```
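
The `fc` layer takes twice the hidden size as input because EAGLE concatenates each token's embedding with the base model's hidden state before feeding the single decoder layer. A minimal sketch of this fusion step (tensor names are hypothetical, assuming `hidden_size = 5120`):

```python
import torch
import torch.nn as nn

hidden_size = 5120                          # Vicuna-13B hidden size
fc = nn.Linear(2 * hidden_size, hidden_size, bias=True)

embeds = torch.randn(1, 16, hidden_size)    # embeddings of the current tokens
features = torch.randn(1, 16, hidden_size)  # base model's hidden states (shifted by one)
fused = fc(torch.cat((embeds, features), dim=-1))  # (1, 16, 5120): decoder-layer input
```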

`vicuna_13B_config.json` (the EAGLE training configuration; `num_hidden_layers` is 1 because the draft model adds only a single decoder layer on top of the base model):

```json
{
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 13824,
  "max_position_embeddings": 2048,
  "model_type": "llama",
  "num_attention_heads": 40,
  "num_hidden_layers": 1,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-06,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.28.1",
  "use_cache": true,
  "vocab_size": 32000
}
```
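
The file can be inspected like any `transformers` config; a quick sanity check (path as used in the training command below):

```python
from transformers import LlamaConfig

config = LlamaConfig.from_json_file("eagle/train/vicuna_13B_config.json")
print(config.hidden_size, config.num_hidden_layers)  # 5120 1
```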
## Model Training

### Data Generation

This step runs the base model over the ShareGPT conversations and records its hidden states, which the draft model is later trained to predict:
```bash
python -m eagle.ge_data.allocation --outdir ../data
```

### Training

```bash
accelerate launch -m --mixed_precision=bf16 eagle.train.main \
    --tmpdir eagle/data/sharegpt_0_67999_mufp16 \
    --cpdir eagle/checkpoint \
    --configpath eagle/train/vicuna_13B_config.json
```
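
For reference, EAGLE's objective combines a Smooth L1 regression loss on the base model's hidden states with a cross-entropy loss on its next-token predictions. A minimal sketch (hypothetical tensor names; the default weights `v_w=1.0`, `p_w=0.1` are assumptions based on the EAGLE training code):

```python
import torch
import torch.nn as nn

def eagle_loss(pred_hidden, target_hidden, pred_logits, target_ids,
               loss_mask, v_w=1.0, p_w=0.1):
    # Feature regression: match the base model's hidden states
    vloss = nn.SmoothL1Loss(reduction="none")(pred_hidden, target_hidden)
    vloss = (vloss.mean(-1) * loss_mask).sum() / loss_mask.sum()
    # Token prediction: match the base model's next-token choices
    ploss = nn.CrossEntropyLoss(reduction="none")(
        pred_logits.flatten(0, 1), target_ids.flatten()
    )
    ploss = (ploss * loss_mask.flatten()).sum() / loss_mask.sum()
    return v_w * vloss + p_w * ploss
```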


### Model Upload

After training, push the checkpoint and the model card to the Hub:



```py
from huggingface_hub import HfApi

api = HfApi()

# Upload only the updated README.md file
api.upload_file(
    path_or_fileobj="checkpoints/eagle-vicuna-13B/README.md",  # local path of the edited README
    path_in_repo="README.md",                                  # target path in the repo (root directory)
    repo_id="Gavin1104/eagle-vicuna-13b-v1.3",
    repo_type="model"
)
```
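
To push the full checkpoint directory rather than a single file, `HfApi.upload_folder` can be used; a sketch assuming the same local layout as above:

```python
from huggingface_hub import HfApi

api = HfApi()
# Upload the entire checkpoint directory to the repo root
api.upload_folder(
    folder_path="checkpoints/eagle-vicuna-13B",
    repo_id="Gavin1104/eagle-vicuna-13b-v1.3",
    repo_type="model",
)
```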