b9153bd2be7f3ce1e2ddf14ca41ed9f3

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

Loss: 1.8543
Data Size: 1.0
Epoch Runtime: 22.6643
Bleu: 6.1993
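
For a rough sense of scale, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which this card does not state explicitly; the variable names are illustrative.

```python
import math

# Evaluation loss reported above.
# Assumption: this is a mean per-token cross-entropy measured in nats.
eval_loss = 1.8543

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 6.4.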

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: constant
num_epochs: 50
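
The total batch sizes follow from the per-device sizes and the device count. The sketch below restates the listed values as a plain dictionary and checks that arithmetic. The key names mirror Hugging Face TrainingArguments parameters, on the assumption that the run used the Trainer. The helper effective_batch_size and a gradient accumulation of 1 are also assumptions, since the card lists neither.

```python
# Values copied from the hyperparameter list above.
# Key names follow Hugging Face TrainingArguments (assumed, not stated in the card).
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
num_devices = 4  # distributed_type: multi-GPU


def effective_batch_size(per_device: int, devices: int, grad_accum_steps: int = 1) -> int:
    """Hypothetical helper: examples consumed per optimizer step across all devices."""
    return per_device * devices * grad_accum_steps


total_train = effective_batch_size(training_kwargs["per_device_train_batch_size"], num_devices)
total_eval = effective_batch_size(training_kwargs["per_device_eval_batch_size"], num_devices)

print(total_train, total_eval)
```

Both values come out at 32, matching total_train_batch_size and total_eval_batch_size, which is consistent with no gradient accumulation.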

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	3.8555	0	2.5592	0.3260
No log	1	87	3.8313	0.0078	3.1464	0.3386
No log	2	174	3.7292	0.0156	2.9788	0.3678
No log	3	261	3.6272	0.0312	3.5239	0.3074
No log	4	348	3.4884	0.0625	4.4046	0.4217
0.1569	5	435	3.3286	0.125	6.0111	0.6300
0.8524	6	522	3.1440	0.25	8.4529	0.9854
1.1574	7	609	2.9492	0.5	13.0708	1.3953
1.7694	8.0	696	2.7182	1.0	23.0007	1.9353
2.8404	9.0	783	2.5684	1.0	21.9491	2.1866
2.6798	10.0	870	2.4528	1.0	21.7347	2.6121
2.5514	11.0	957	2.3663	1.0	22.3504	2.8020
2.4704	12.0	1044	2.2948	1.0	22.0997	3.0231
2.368	13.0	1131	2.2348	1.0	22.6331	3.3599
2.2789	14.0	1218	2.1800	1.0	22.5807	3.7844
2.1823	15.0	1305	2.1474	1.0	22.5067	3.9405
2.1312	16.0	1392	2.1016	1.0	21.6881	4.1275
2.0687	17.0	1479	2.0765	1.0	22.8609	4.5020
2.0178	18.0	1566	2.0386	1.0	21.9435	4.5653
1.9657	19.0	1653	2.0131	1.0	22.3212	4.7538
1.913	20.0	1740	1.9926	1.0	22.8330	4.8370
1.8548	21.0	1827	1.9681	1.0	22.8307	5.1520
1.8268	22.0	1914	1.9552	1.0	22.1181	5.1588
1.7728	23.0	2001	1.9444	1.0	22.4836	5.1969
1.7147	24.0	2088	1.9215	1.0	22.6886	5.3652
1.7	25.0	2175	1.9059	1.0	22.8922	5.4540
1.6593	26.0	2262	1.9036	1.0	24.4123	5.4991
1.6265	27.0	2349	1.8935	1.0	22.2225	5.6249
1.5661	28.0	2436	1.8860	1.0	23.6774	5.6835
1.5536	29.0	2523	1.8807	1.0	22.6511	5.7394
1.5256	30.0	2610	1.8669	1.0	23.6931	5.7594
1.4802	31.0	2697	1.8775	1.0	23.1565	5.8550
1.4556	32.0	2784	1.8551	1.0	23.2581	5.8120
1.4115	33.0	2871	1.8563	1.0	23.3494	5.9032
1.3966	34.0	2958	1.8597	1.0	22.6237	5.9444
1.3943	35.0	3045	1.8580	1.0	23.7098	6.0622
1.3277	36.0	3132	1.8452	1.0	22.0464	6.1380
1.3022	37.0	3219	1.8406	1.0	22.6885	6.1072
1.2946	38.0	3306	1.8724	1.0	22.8054	6.2527
1.2564	39.0	3393	1.8583	1.0	22.2587	6.2054
1.2513	40.0	3480	1.8468	1.0	23.3702	6.2147
1.2395	41.0	3567	1.8543	1.0	22.6643	6.1993
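
The headline results at the top of this card match the last logged row (epoch 41), which is not the best row in the table. The table also ends at epoch 41 although num_epochs is 50; the card does not say why. The sketch below picks out the best epochs by validation loss and by BLEU from the final rows, which are copied from the table. Every earlier epoch has a higher validation loss and a lower BLEU than the rows shown here. Whether a checkpoint from those epochs was saved is not stated, so treat this as a reading aid only.

```python
# (epoch, validation_loss, bleu) copied from the last rows of the table above.
rows = [
    (30, 1.8669, 5.7594),
    (31, 1.8775, 5.8550),
    (32, 1.8551, 5.8120),
    (33, 1.8563, 5.9032),
    (34, 1.8597, 5.9444),
    (35, 1.8580, 6.0622),
    (36, 1.8452, 6.1380),
    (37, 1.8406, 6.1072),
    (38, 1.8724, 6.2527),
    (39, 1.8583, 6.2054),
    (40, 1.8468, 6.2147),
    (41, 1.8543, 6.1993),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r[2])  # highest BLEU
last_logged = rows[-1]                        # row reported in the card summary

print("best validation loss:", best_by_loss)
print("best BLEU:", best_by_bleu)
print("last logged:", last_logged)
```

The lowest validation loss is at epoch 37 (1.8406) and the highest BLEU at epoch 38 (6.2527), so the two criteria select different epochs.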

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Model size: 0.3B params (Safetensors, tensor type F32)

Model tree: contemmcm/b9153bd2be7f3ce1e2ddf14ca41ed9f3 is fine-tuned from the base model google-t5/t5-base.