Update README.md

---
license: apache-2.0
---

## What are G0 / G1 / G1a2 / G1b?

The fields like G0a / G1a / G1a2 in RWKV model names indicate versions of the training data. In terms of data quality, the ranking is: **G1b > G1a3 > G1a2 > G1a > G1 > G0a2 > G0**.

The RWKV7-G1a model is an advanced version of RWKV7-G1, further trained on 1T (1 trillion) tokens of high-quality reasoning and instruction data. RWKV7-G1a2 was produced by continuing to add data and train on top of RWKV7-G1a, and so on.

> [!TIP]
> More high-quality data will be added later to form the G1b dataset, and RWKV7-G1b series models will also be trained and open-sourced.

## What is the difference between the RWKV7-G series and the World series?

The RWKV7-G series supports a reasoning ("think") mode, which can be activated with the following prompt format:

```
User: USER_PROMPT

Assistant: <think
```
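
As a concrete illustration, here is a minimal Python sketch of assembling that prompt before handing it to whatever inference backend you use; the helper name `build_reasoning_prompt` is a hypothetical example, not part of any official RWKV API.

```python
def build_reasoning_prompt(user_prompt: str) -> str:
    """Assemble the RWKV7-G series reasoning-mode prompt.

    The trailing "<think" is deliberately left unclosed so that the model
    continues it with its own reasoning before answering.
    """
    return f"User: {user_prompt}\n\nAssistant: <think"


# Feed the returned string to your inference backend of choice
# (llama.cpp, the RWKV pip package, a web UI, ...).
print(build_reasoning_prompt("Why is the sky blue?"))
```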

## How to choose the best model?

**Look at the date in the model name**: for the same parameter size, a newer model is better!

For example, for the same 1.5B model, a G1a2 version released on `251005` will definitely be superior to a G1 version released on `250429`.
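
If you are scripting over a set of checkpoints, the date comparison is easy to automate. The sketch below assumes file names that embed a YYMMDD date like the examples above; the file names themselves are made up for illustration.

```python
import re
from datetime import datetime

# Hypothetical file names that embed a YYMMDD release date, as in the
# `251005` / `250429` examples above.
MODELS = [
    "rwkv7-g1-1.5b-250429.gguf",
    "rwkv7-g1a2-1.5b-251005.gguf",
]

def release_date(name: str) -> datetime:
    """Return the first 6-digit YYMMDD token found in a model name."""
    match = re.search(r"\d{6}", name)
    if match is None:
        raise ValueError(f"no YYMMDD date found in {name!r}")
    return datetime.strptime(match.group(), "%y%m%d")

# Among checkpoints of the same parameter size, prefer the newest one.
print(max(MODELS, key=release_date))  # rwkv7-g1a2-1.5b-251005.gguf
```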

> [!WARNING]
> For the 0.1B and 0.4B models, we recommend using FP16 or Q8_0. Otherwise, the models may fail to complete tasks due to the precision loss caused by lower-bit quantization.