f02aefd7b91f5c9231a170bc6a587c91

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2741
  • Data Size: 1.0
  • Epoch Runtime: 18.2376
  • Bleu: 1.1010
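
As a usage reference, here is a minimal inference sketch with the Transformers seq2seq API. The opus_books language pair and task prefix used for this fine-tune are not documented in this card, so the prompt below is purely illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/f02aefd7b91f5c9231a170bc6a587c91"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints usually expect a task prefix; the prefix (and language
# pair) used during fine-tuning is not stated in the card, so this
# prompt is an assumption for illustration only.
inputs = tokenizer(
    "translate English to French: The book is on the table.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```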

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
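
These settings map onto the Trainer API roughly as follows. This is a hypothetical reconstruction, not the actual training script (which is not included in this card); the per-device batch size of 8 across 4 GPUs yields the listed total batch size of 32:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books",  # placeholder path, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,    # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,       # assumed, since BLEU is reported
)
```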

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 3.4425          | 0         | 2.3490        | 0.1512 |
| No log        | 1     | 70   | 3.4154          | 0.0078    | 2.3909        | 0.1512 |
| No log        | 2     | 140  | 3.3242          | 0.0156    | 2.7178        | 0.1499 |
| No log        | 3     | 210  | 3.3208          | 0.0312    | 3.3120        | 0.1561 |
| No log        | 4     | 280  | 3.2790          | 0.0625    | 4.0068        | 0.1891 |
| No log        | 5     | 350  | 3.0620          | 0.125     | 5.3587        | 0.1589 |
| No log        | 6     | 420  | 2.9118          | 0.25      | 7.7588        | 0.3948 |
| 0.4887        | 7     | 490  | 2.7663          | 0.5       | 11.0609       | 0.5381 |
| 2.8782        | 8.0   | 560  | 2.6579          | 1.0       | 19.8240       | 0.4605 |
| 2.7819        | 9.0   | 630  | 2.5914          | 1.0       | 18.3191       | 0.5575 |
| 2.6843        | 10.0  | 700  | 2.5362          | 1.0       | 18.6385       | 0.5353 |
| 2.6346        | 11.0  | 770  | 2.4998          | 1.0       | 18.7627       | 0.5979 |
| 2.5855        | 12.0  | 840  | 2.4674          | 1.0       | 18.1962       | 0.6150 |
| 2.5201        | 13.0  | 910  | 2.4381          | 1.0       | 18.0755       | 0.6466 |
| 2.4707        | 14.0  | 980  | 2.4090          | 1.0       | 17.3000       | 0.7040 |
| 2.4341        | 15.0  | 1050 | 2.3954          | 1.0       | 18.4257       | 0.7404 |
| 2.4106        | 16.0  | 1120 | 2.3701          | 1.0       | 19.0556       | 0.7352 |
| 2.3603        | 17.0  | 1190 | 2.3497          | 1.0       | 17.9710       | 0.7189 |
| 2.3126        | 18.0  | 1260 | 2.3398          | 1.0       | 18.4536       | 0.7632 |
| 2.2971        | 19.0  | 1330 | 2.3232          | 1.0       | 17.4169       | 0.7931 |
| 2.2549        | 20.0  | 1400 | 2.3168          | 1.0       | 18.2895       | 0.8166 |
| 2.2321        | 21.0  | 1470 | 2.3120          | 1.0       | 18.2168       | 0.8283 |
| 2.1912        | 22.0  | 1540 | 2.3055          | 1.0       | 17.4534       | 0.8203 |
| 2.1623        | 23.0  | 1610 | 2.2961          | 1.0       | 17.8610       | 0.9116 |
| 2.1378        | 24.0  | 1680 | 2.2850          | 1.0       | 17.7668       | 0.8915 |
| 2.1166        | 25.0  | 1750 | 2.2783          | 1.0       | 18.6735       | 0.8682 |
| 2.0762        | 26.0  | 1820 | 2.2735          | 1.0       | 18.3668       | 0.9008 |
| 2.0812        | 27.0  | 1890 | 2.2746          | 1.0       | 18.6053       | 0.8955 |
| 2.0384        | 28.0  | 1960 | 2.2715          | 1.0       | 17.9728       | 0.9002 |
| 2.0101        | 29.0  | 2030 | 2.2714          | 1.0       | 17.3267       | 0.8992 |
| 1.9769        | 30.0  | 2100 | 2.2724          | 1.0       | 18.2268       | 0.9390 |
| 1.9509        | 31.0  | 2170 | 2.2617          | 1.0       | 19.5187       | 0.9829 |
| 1.9419        | 32.0  | 2240 | 2.2671          | 1.0       | 19.2548       | 1.0183 |
| 1.9153        | 33.0  | 2310 | 2.2717          | 1.0       | 17.9231       | 0.9995 |
| 1.8914        | 34.0  | 2380 | 2.2734          | 1.0       | 18.3847       | 1.0418 |
| 1.8725        | 35.0  | 2450 | 2.2741          | 1.0       | 18.2376       | 1.1010 |
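
The Bleu column presumably follows the standard corpus-level sacreBLEU convention (0-100 scale); the exact metric configuration is not stated in this card. A minimal sketch of computing such a score with the evaluate library, under that assumption:

```python
import evaluate

# Assumption: scores follow the standard sacreBLEU setup; the card
# does not document the metric configuration.
bleu = evaluate.load("sacrebleu")
predictions = ["the book is on the table ."]    # model outputs
references = [["The book is on the table."]]    # one reference list per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```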

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1