f02aefd7b91f5c9231a170bc6a587c91

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2741
  • Data Size: 1.0
  • Epoch Runtime: 18.2376
  • Bleu: 1.1010
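
As a usage reference, here is a minimal inference sketch with the Transformers seq2seq API. The opus_books language pair and task prefix used for this fine-tune are not documented in this card, so the prompt below is purely illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/f02aefd7b91f5c9231a170bc6a587c91"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints usually expect a task prefix; the prefix (and language
# pair) used during fine-tuning is not stated in the card, so this
# prompt is an assumption for illustration only.
inputs = tokenizer(
    "translate English to French: The book is on the table.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```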

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
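
These settings map onto the Trainer API roughly as follows. This is a hypothetical reconstruction, not the actual training script (which is not included in this card); the per-device batch size of 8 across 4 GPUs yields the listed total batch size of 32:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books",  # placeholder path, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,    # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,       # assumed, since BLEU is reported
)
```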

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 3.4425          | 0         | 2.3490        | 0.1512 |
| No log        | 1     | 70   | 3.4154          | 0.0078    | 2.3909        | 0.1512 |
| No log        | 2     | 140  | 3.3242          | 0.0156    | 2.7178        | 0.1499 |
| No log        | 3     | 210  | 3.3208          | 0.0312    | 3.3120        | 0.1561 |
| No log        | 4     | 280  | 3.2790          | 0.0625    | 4.0068        | 0.1891 |
| No log        | 5     | 350  | 3.0620          | 0.125     | 5.3587        | 0.1589 |
| No log        | 6     | 420  | 2.9118          | 0.25      | 7.7588        | 0.3948 |
| 0.4887        | 7     | 490  | 2.7663          | 0.5       | 11.0609       | 0.5381 |
| 2.8782        | 8.0   | 560  | 2.6579          | 1.0       | 19.8240       | 0.4605 |
| 2.7819        | 9.0   | 630  | 2.5914          | 1.0       | 18.3191       | 0.5575 |
| 2.6843        | 10.0  | 700  | 2.5362          | 1.0       | 18.6385       | 0.5353 |
| 2.6346        | 11.0  | 770  | 2.4998          | 1.0       | 18.7627       | 0.5979 |
| 2.5855        | 12.0  | 840  | 2.4674          | 1.0       | 18.1962       | 0.6150 |
| 2.5201        | 13.0  | 910  | 2.4381          | 1.0       | 18.0755       | 0.6466 |
| 2.4707        | 14.0  | 980  | 2.4090          | 1.0       | 17.3000       | 0.7040 |
| 2.4341        | 15.0  | 1050 | 2.3954          | 1.0       | 18.4257       | 0.7404 |
| 2.4106        | 16.0  | 1120 | 2.3701          | 1.0       | 19.0556       | 0.7352 |
| 2.3603        | 17.0  | 1190 | 2.3497          | 1.0       | 17.9710       | 0.7189 |
| 2.3126        | 18.0  | 1260 | 2.3398          | 1.0       | 18.4536       | 0.7632 |
| 2.2971        | 19.0  | 1330 | 2.3232          | 1.0       | 17.4169       | 0.7931 |
| 2.2549        | 20.0  | 1400 | 2.3168          | 1.0       | 18.2895       | 0.8166 |
| 2.2321        | 21.0  | 1470 | 2.3120          | 1.0       | 18.2168       | 0.8283 |
| 2.1912        | 22.0  | 1540 | 2.3055          | 1.0       | 17.4534       | 0.8203 |
| 2.1623        | 23.0  | 1610 | 2.2961          | 1.0       | 17.8610       | 0.9116 |
| 2.1378        | 24.0  | 1680 | 2.2850          | 1.0       | 17.7668       | 0.8915 |
| 2.1166        | 25.0  | 1750 | 2.2783          | 1.0       | 18.6735       | 0.8682 |
| 2.0762        | 26.0  | 1820 | 2.2735          | 1.0       | 18.3668       | 0.9008 |
| 2.0812        | 27.0  | 1890 | 2.2746          | 1.0       | 18.6053       | 0.8955 |
| 2.0384        | 28.0  | 1960 | 2.2715          | 1.0       | 17.9728       | 0.9002 |
| 2.0101        | 29.0  | 2030 | 2.2714          | 1.0       | 17.3267       | 0.8992 |
| 1.9769        | 30.0  | 2100 | 2.2724          | 1.0       | 18.2268       | 0.9390 |
| 1.9509        | 31.0  | 2170 | 2.2617          | 1.0       | 19.5187       | 0.9829 |
| 1.9419        | 32.0  | 2240 | 2.2671          | 1.0       | 19.2548       | 1.0183 |
| 1.9153        | 33.0  | 2310 | 2.2717          | 1.0       | 17.9231       | 0.9995 |
| 1.8914        | 34.0  | 2380 | 2.2734          | 1.0       | 18.3847       | 1.0418 |
| 1.8725        | 35.0  | 2450 | 2.2741          | 1.0       | 18.2376       | 1.1010 |
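
The Bleu column presumably follows the standard corpus-level sacreBLEU convention (0-100 scale); the exact metric configuration is not stated in this card. A minimal sketch of computing such a score with the evaluate library, under that assumption:

```python
import evaluate

# Assumption: scores follow the standard sacreBLEU setup; the card
# does not document the metric configuration.
bleu = evaluate.load("sacrebleu")
predictions = ["the book is on the table ."]    # model outputs
references = [["The book is on the table."]]    # one reference list per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```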

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1