End of training
README.md
CHANGED
@@ -44,18 +44,28 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 1000
-- training_steps:
+- training_steps: 15000
 - mixed_precision_training: Native AMP
 
 ### Training results
 
-| Training Loss | Epoch | Step
-
-| 3.1172 | 0.0356 | 1000
-| 3.2603 | 0.0711 | 2000
-| 3.222 | 0.1067 | 3000
-| 3.1457 | 0.1422 | 4000
-| 3.0929 | 0.1778 | 5000
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:------:|:-----:|:---------------:|
+| 3.1172 | 0.0356 | 1000 | nan |
+| 3.2603 | 0.0711 | 2000 | nan |
+| 3.222 | 0.1067 | 3000 | nan |
+| 3.1457 | 0.1422 | 4000 | nan |
+| 3.0929 | 0.1778 | 5000 | nan |
+| 3.1864 | 0.2133 | 6000 | nan |
+| 3.1887 | 0.2489 | 7000 | nan |
+| 3.162 | 0.2844 | 8000 | nan |
+| 3.1355 | 0.32 | 9000 | nan |
+| 3.1201 | 0.3556 | 10000 | nan |
+| 3.0831 | 0.3911 | 11000 | nan |
+| 3.0724 | 0.4267 | 12000 | nan |
+| 3.0465 | 0.4622 | 13000 | nan |
+| 3.0446 | 0.4978 | 14000 | nan |
+| 3.0422 | 0.5333 | 15000 | nan |
 
 
 ### Framework versions
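
The hyperparameters in this hunk (`lr_scheduler_type: cosine`, `lr_scheduler_warmup_steps: 1000`, `training_steps: 15000`) imply a learning-rate shape that can be sketched in plain Python. This is a minimal sketch, assuming linear warmup followed by a single half-cosine decay to zero. The function name `lr_multiplier` is hypothetical, and the base learning rate does not appear in this hunk, so the code returns only a multiplier on it rather than an absolute rate.

```python
import math

# Values taken from the hyperparameter list in the diff.
WARMUP_STEPS = 1000
TRAINING_STEPS = 15000


def lr_multiplier(step: int,
                  warmup_steps: int = WARMUP_STEPS,
                  training_steps: int = TRAINING_STEPS) -> float:
    """Return the factor applied to the base learning rate at `step`.

    Assumption: linear warmup from 0 to 1, then half-cosine decay from 1 to 0.
    The base learning rate itself is not shown in this hunk.
    """
    if step < warmup_steps:
        # Linear ramp during warmup.
        return step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    # Half-cosine decay, clamped at zero to absorb rounding error.
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Print the multiplier at the logging steps used in the results table.
    for step in range(1000, TRAINING_STEPS + 1, 1000):
        print(f"step {step:>5}: multiplier {lr_multiplier(step):.4f}")
```

Under these assumptions the multiplier peaks at step 1000 and reaches zero at step 15000, so the later rows of the table were logged at progressively smaller learning rates.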