# life2lang-pt
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.0482
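For scale, and assuming this is the mean per-token cross-entropy in nats (the `transformers` Trainer default for seq2seq language modeling), the reported loss corresponds to a perplexity of roughly exp(1.0482) ≈ 2.85:

```python
import math

# Perplexity implied by the reported evaluation loss, assuming the loss is
# the mean per-token cross-entropy in nats (the Trainer default).
eval_loss = 1.0482
print(f"perplexity ~= {math.exp(eval_loss):.2f}")  # ~= 2.85
```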
## Model description
More information needed
## Intended uses & limitations
More information needed
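Although the intended task is not yet documented, the checkpoint inherits the standard flan-t5 seq2seq interface, so loading follows the usual pattern. A minimal sketch, where the repo id `<user>/life2lang-pt` is a placeholder since the card does not state where the checkpoint is hosted:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder repo id; replace with the actual hub location of this checkpoint.
model_id = "<user>/life2lang-pt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encode an input, generate, and decode the prediction.
inputs = tokenizer("your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```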
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch reconstructing them in code follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
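A hypothetical reconstruction of this configuration as `Seq2SeqTrainingArguments`; `output_dir` is a placeholder, and any argument not in the list above keeps its library default:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="life2lang-pt",       # placeholder, not stated in the card
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,   # total train batch size: 16 * 4 = 64
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
)
```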
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 22.1588 | 0.0078 | 50 | 19.2749 |
| 18.4364 | 0.0157 | 100 | 14.3397 |
| 11.7105 | 0.0235 | 150 | 4.3493 |
| 4.0853 | 0.0313 | 200 | 3.2481 |
| 3.1862 | 0.0392 | 250 | 2.5404 |
| 2.6147 | 0.0470 | 300 | 2.0131 |
| 2.1487 | 0.0548 | 350 | 1.8028 |
| 1.8747 | 0.0627 | 400 | 1.6388 |
| 1.64 | 0.0705 | 450 | 1.3478 |
| 1.4195 | 0.0783 | 500 | 1.2805 |
| 1.3282 | 0.0862 | 550 | 1.2760 |
| 1.2827 | 0.0940 | 600 | 1.2517 |
| 1.2636 | 0.1018 | 650 | 1.2455 |
| 1.2392 | 0.1097 | 700 | 1.2260 |
| 1.2323 | 0.1175 | 750 | 1.2223 |
| 1.2063 | 0.1253 | 800 | 1.2105 |
| 1.206 | 0.1332 | 850 | 1.2048 |
| 1.1913 | 0.1410 | 900 | 1.2007 |
| 1.1932 | 0.1488 | 950 | 1.1919 |
| 1.1795 | 0.1567 | 1000 | 1.1870 |
| 1.1716 | 0.1645 | 1050 | 1.1852 |
| 1.1745 | 0.1723 | 1100 | 1.1754 |
| 1.1641 | 0.1802 | 1150 | 1.1689 |
| 1.1497 | 0.1880 | 1200 | 1.1679 |
| 1.1526 | 0.1958 | 1250 | 1.1639 |
| 1.1494 | 0.2037 | 1300 | 1.1641 |
| 1.1527 | 0.2115 | 1350 | 1.1549 |
| 1.1327 | 0.2193 | 1400 | 1.1553 |
| 1.1262 | 0.2272 | 1450 | 1.1590 |
| 1.1299 | 0.2350 | 1500 | 1.1570 |
| 1.1255 | 0.2428 | 1550 | 1.1492 |
| 1.1257 | 0.2507 | 1600 | 1.1535 |
| 1.1174 | 0.2585 | 1650 | 1.1481 |
| 1.119 | 0.2663 | 1700 | 1.1283 |
| 1.1114 | 0.2742 | 1750 | 1.1421 |
| 1.1032 | 0.2820 | 1800 | 1.1262 |
| 1.1089 | 0.2898 | 1850 | 1.1282 |
| 1.1017 | 0.2977 | 1900 | 1.1130 |
| 1.1007 | 0.3055 | 1950 | 1.1164 |
| 1.0949 | 0.3133 | 2000 | 1.1174 |
| 1.0946 | 0.3212 | 2050 | 1.1306 |
| 1.0927 | 0.3290 | 2100 | 1.1103 |
| 1.0912 | 0.3368 | 2150 | 1.1406 |
| 1.0878 | 0.3447 | 2200 | 1.1136 |
| 1.0825 | 0.3525 | 2250 | 1.1036 |
| 1.0835 | 0.3603 | 2300 | 1.1054 |
| 1.0783 | 0.3682 | 2350 | 1.1127 |
| 1.0783 | 0.3760 | 2400 | 1.1087 |
| 1.0736 | 0.3838 | 2450 | 1.1034 |
| 1.0739 | 0.3917 | 2500 | 1.0999 |
| 1.0679 | 0.3995 | 2550 | 1.0904 |
| 1.0712 | 0.4073 | 2600 | 1.0927 |
| 1.063 | 0.4151 | 2650 | 1.0933 |
| 1.0689 | 0.4230 | 2700 | 1.0902 |
| 1.0659 | 0.4308 | 2750 | 1.0965 |
| 1.0616 | 0.4386 | 2800 | 1.0908 |
| 1.0635 | 0.4465 | 2850 | 1.0840 |
| 1.0614 | 0.4543 | 2900 | 1.0853 |
| 1.0632 | 0.4621 | 2950 | 1.0967 |
| 1.0585 | 0.4700 | 3000 | 1.0868 |
| 1.0494 | 0.4778 | 3050 | 1.0878 |
| 1.0548 | 0.4856 | 3100 | 1.0833 |
| 1.0495 | 0.4935 | 3150 | 1.0726 |
| 1.0565 | 0.5013 | 3200 | 1.0805 |
| 1.0446 | 0.5091 | 3250 | 1.0685 |
| 1.053 | 0.5170 | 3300 | 1.0700 |
| 1.0509 | 0.5248 | 3350 | 1.0794 |
| 1.0491 | 0.5326 | 3400 | 1.0678 |
| 1.0477 | 0.5405 | 3450 | 1.0679 |
| 1.0461 | 0.5483 | 3500 | 1.0716 |
| 1.0407 | 0.5561 | 3550 | 1.0737 |
| 1.0425 | 0.5640 | 3600 | 1.0628 |
| 1.0354 | 0.5718 | 3650 | 1.0670 |
| 1.0402 | 0.5796 | 3700 | 1.0671 |
| 1.0376 | 0.5875 | 3750 | 1.0685 |
| 1.0379 | 0.5953 | 3800 | 1.0673 |
| 1.0328 | 0.6031 | 3850 | 1.0579 |
| 1.0354 | 0.6110 | 3900 | 1.0622 |
| 1.0313 | 0.6188 | 3950 | 1.0621 |
| 1.0363 | 0.6266 | 4000 | 1.0590 |
| 1.0362 | 0.6345 | 4050 | 1.0653 |
| 1.0333 | 0.6423 | 4100 | 1.0593 |
| 1.0285 | 0.6501 | 4150 | 1.0578 |
| 1.0309 | 0.6580 | 4200 | 1.0581 |
| 1.0305 | 0.6658 | 4250 | 1.0561 |
| 1.026 | 0.6736 | 4300 | 1.0569 |
| 1.0315 | 0.6815 | 4350 | 1.0585 |
| 1.0317 | 0.6893 | 4400 | 1.0522 |
| 1.0348 | 0.6971 | 4450 | 1.0574 |
| 1.0293 | 0.7050 | 4500 | 1.0545 |
| 1.0283 | 0.7128 | 4550 | 1.0524 |
| 1.0274 | 0.7206 | 4600 | 1.0544 |
| 1.026 | 0.7285 | 4650 | 1.0525 |
| 1.0258 | 0.7363 | 4700 | 1.0541 |
| 1.0268 | 0.7441 | 4750 | 1.0580 |
| 1.0244 | 0.7520 | 4800 | 1.0484 |
| 1.0271 | 0.7598 | 4850 | 1.0508 |
| 1.0279 | 0.7676 | 4900 | 1.0508 |
| 1.0265 | 0.7755 | 4950 | 1.0478 |
| 1.0264 | 0.7833 | 5000 | 1.0520 |
| 1.0203 | 0.7911 | 5050 | 1.0496 |
| 1.026 | 0.7990 | 5100 | 1.0473 |
| 1.0227 | 0.8068 | 5150 | 1.0494 |
| 1.0227 | 0.8146 | 5200 | 1.0497 |
| 1.0228 | 0.8225 | 5250 | 1.0513 |
| 1.022 | 0.8303 | 5300 | 1.0521 |
| 1.022 | 0.8381 | 5350 | 1.0502 |
| 1.0259 | 0.8460 | 5400 | 1.0489 |
| 1.021 | 0.8538 | 5450 | 1.0504 |
| 1.0244 | 0.8616 | 5500 | 1.0475 |
| 1.025 | 0.8695 | 5550 | 1.0490 |
| 1.0197 | 0.8773 | 5600 | 1.0513 |
| 1.0224 | 0.8851 | 5650 | 1.0477 |
| 1.0189 | 0.8930 | 5700 | 1.0490 |
| 1.0227 | 0.9008 | 5750 | 1.0490 |
| 1.0182 | 0.9086 | 5800 | 1.0487 |
| 1.0201 | 0.9165 | 5850 | 1.0480 |
| 1.0239 | 0.9243 | 5900 | 1.0495 |
| 1.0182 | 0.9321 | 5950 | 1.0489 |
| 1.0224 | 0.9400 | 6000 | 1.0485 |
| 1.0251 | 0.9478 | 6050 | 1.0478 |
| 1.0223 | 0.9556 | 6100 | 1.0488 |
| 1.0232 | 0.9635 | 6150 | 1.0481 |
| 1.0217 | 0.9713 | 6200 | 1.0482 |
| 1.0194 | 0.9791 | 6250 | 1.0483 |
| 1.0208 | 0.9870 | 6300 | 1.0481 |
| 1.0219 | 0.9948 | 6350 | 1.0482 |
### Framework versions
- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.2
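A quick sanity check that a local environment matches these versions (the expected strings simply mirror the list above):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions from the "Framework versions" section of this card.
expected = {
    "transformers": (transformers, "4.52.4"),
    "torch": (torch, "2.6.0+cu124"),
    "datasets": (datasets, "3.6.0"),
    "tokenizers": (tokenizers, "0.21.2"),
}
for name, (module, want) in expected.items():
    print(f"{name}: installed {module.__version__}, card used {want}")
```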