Update README.md
README.md CHANGED
@@ -35,7 +35,7 @@ There are three versions:
### Training Details

1) Training Epochs are calculated from the number of full passes over the dataset and were set via the n_step parameter in the initialization of Trainer.
-Finally, there are 1 for nano model, 1 for mini model,
+Finally, there is 1 for the nano model, 1 for the mini model, and 6 for the small model.

2) Batch Size: 32 for nano and mini; 64 for small.
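Since the epoch counts above are derived from n_step rather than set directly, a quick arithmetic sketch may help. This is a hypothetical illustration: the diff does not state the dataset size or the Trainer's exact step-counting convention, and only the batch sizes come from the README.

```python
# Hypothetical sketch: epochs implied by a step budget at a given batch size.
# Only the batch sizes (32 for nano/mini, 64 for small) come from the README;
# the dataset size and step count below are made-up illustration values.
def implied_epochs(n_step: int, batch_size: int, dataset_size: int) -> float:
    """Number of full dataset passes implied by n_step optimizer steps."""
    return n_step * batch_size / dataset_size

# E.g., with an assumed dataset of 150,000 samples, about 14,000 steps at
# batch size 64 would correspond to the ~6 epochs quoted for the small model.
print(implied_epochs(n_step=14_000, batch_size=64, dataset_size=150_000))  # ~5.97
```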
@@ -92,8 +92,6 @@ Loss:

[loss plot]

-Here is the neatly formatted Markdown table in English:
-
Epoch:

| Parameter | Min | Max | Cur |
@@ -117,8 +115,8 @@ in the `small` - `small`:

```python
# Small model
-model_small = TransformerForCausalLM.from_pretrained("estnafinema0/
-tokenizer = ByteLevelBPETokenizer.from_pretrained("estnafinema0/
+model_small = TransformerForCausalLM.from_pretrained("estnafinema0/russian-jokes-generator", revision="small")
+tokenizer = ByteLevelBPETokenizer.from_pretrained("estnafinema0/russian-jokes-generator")
```

To generate the examples with the initial prompt:
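The hunk above ends on the context line that introduces the README's generation example, so the call itself is not visible in this diff. Below is a minimal sketch of what generating from an initial prompt could look like, assuming the repo's custom TransformerForCausalLM and ByteLevelBPETokenizer classes follow the Hugging Face-style API of the loading snippet; the generate, encode, and decode method names and signatures are assumptions, not taken from the repo.

```python
# Minimal usage sketch, not the repo's actual example. TransformerForCausalLM
# and ByteLevelBPETokenizer are the repo's own classes; the generate/encode/
# decode calls below are assumptions modeled on the Hugging Face API.
model_small = TransformerForCausalLM.from_pretrained(
    "estnafinema0/russian-jokes-generator", revision="small"
)
tokenizer = ByteLevelBPETokenizer.from_pretrained("estnafinema0/russian-jokes-generator")

prompt = "Штирлиц шел по лесу"  # hypothetical prompt; the model generates Russian jokes
input_ids = tokenizer.encode(prompt)                              # assumed to return token ids
output_ids = model_small.generate(input_ids, max_new_tokens=100)  # assumed signature
print(tokenizer.decode(output_ids))                               # assumed to return a string
```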