Update README.md
Browse files
README.md
CHANGED
|
@@ -16,6 +16,18 @@ The model was trained on ~8 billion tokens.
|
|
| 16 |
- Extended Training: Further refinement of the model, resulting in improved benchmark performance and overall text generation quality.
|
| 17 |
- Tokenizer changes.
|
| 18 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 19 |
## How coherent is the 150M model?
|
| 20 |
Let's look at real-world examples:
|
| 21 |
|
|
@@ -132,18 +144,6 @@ The model shows some promise in understanding context related to simple requests
|
|
| 132 |
</tr>
|
| 133 |
</table>
|
| 134 |
|
| 135 |
-
## Chat format
|
| 136 |
-
|
| 137 |
-
This model uses a specific chat format for optimal performance.
|
| 138 |
-
```
|
| 139 |
-
<s>system
|
| 140 |
-
[System message]</s>
|
| 141 |
-
<s>user
|
| 142 |
-
[Your question or message]</s>
|
| 143 |
-
<s>assistant
|
| 144 |
-
[The model's response]</s>
|
| 145 |
-
```
|
| 146 |
-
|
| 147 |
## Usage with HuggingFace transformers
|
| 148 |
The model can be used with HuggingFace's `transformers` library:
|
| 149 |
```python
|
|
|
|
| 16 |
- Extended Training: Further refinement of the model, resulting in improved benchmark performance and overall text generation quality.
|
| 17 |
- Tokenizer changes.
|
| 18 |
|
| 19 |
+
## Chat format
|
| 20 |
+
|
| 21 |
+
This model is **very sensitive** to the chat template used. Ensure you use the correct template:
|
| 22 |
+
```
|
| 23 |
+
<s>system
|
| 24 |
+
[System message]</s>
|
| 25 |
+
<s>user
|
| 26 |
+
[Your question or message]</s>
|
| 27 |
+
<s>assistant
|
| 28 |
+
[The model's response]</s>
|
| 29 |
+
```
|
| 30 |
+
|
| 31 |
## How coherent is the 150M model?
|
| 32 |
Let's look at real-world examples:
|
| 33 |
|
|
|
|
| 144 |
</tr>
|
| 145 |
</table>
|
| 146 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 147 |
## Usage with HuggingFace transformers
|
| 148 |
The model can be used with HuggingFace's `transformers` library:
|
| 149 |
```python
|