nllb-600M-medical-luganda-bidirectional

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5663
  • BLEU: 6.6618
  • chrF: 22.9805
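
The checkpoint is a PEFT adapter on top of facebook/nllb-200-distilled-600M. Below is a minimal usage sketch, assuming the adapter is hosted at KMayanja/nllb-600M-medical-luganda-bidirectional and that the standard NLLB language codes eng_Latn and lug_Latn are used for the English/Luganda directions; the example sentence is a placeholder.

```python
# Hedged usage sketch: repo id, language codes, and the example sentence are
# assumptions; adapt them to how the adapter was actually trained.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_id = "facebook/nllb-200-distilled-600M"
adapter_id = "KMayanja/nllb-600M-medical-luganda-bidirectional"

tokenizer = AutoTokenizer.from_pretrained(base_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the PEFT adapter

text = "Take one tablet twice a day after meals."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("lug_Latn"),  # decode into Luganda
    max_new_tokens=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

For the reverse direction, load the tokenizer with src_lang="lug_Latn" and force eng_Latn as the target language token instead.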

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 12
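
The listed values map onto a Seq2SeqTrainingArguments configuration roughly as sketched below; the output directory and any options not listed above (evaluation/save strategy, precision, the LoRA/PEFT config itself) are assumptions, not taken from this card.

```python
# Sketch of the listed hyperparameters as Seq2SeqTrainingArguments; output_dir
# and anything not in the list above is a placeholder assumption.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-600M-medical-luganda-bidirectional",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,   # effective train batch size: 4 * 8 = 32
    num_train_epochs=12,
    lr_scheduler_type="linear",
    warmup_steps=100,
    optim="adamw_torch_fused",       # AdamW with default betas=(0.9, 0.999), eps=1e-08
    seed=42,
    predict_with_generate=True,      # needed to compute BLEU/chrF during evaluation
)
```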

Training results

| Training Loss | Epoch   | Step | Validation Loss | BLEU   | chrF    |
|---------------|---------|------|-----------------|--------|---------|
| 5.65          | 0.9434  | 400  | 5.0490          | 1.802  | 15.6619 |
| 4.5322        | 1.8868  | 800  | 4.4031          | 1.3741 | 15.75   |
| 3.8473        | 2.8302  | 1200 | 3.8525          | 1.2316 | 16.2379 |
| 3.3184        | 3.7736  | 1600 | 3.4441          | 1.2876 | 17.0497 |
| 3.0563        | 4.7170  | 2000 | 3.2368          | 1.5506 | 17.5027 |
| 2.7198        | 5.6604  | 2400 | 3.0402          | 2.255  | 18.2851 |
| 2.5405        | 6.6038  | 2800 | 2.9127          | 3.5638 | 19.6118 |
| 2.4426        | 7.5472  | 3200 | 2.8274          | 4.2756 | 20.7453 |
| 2.2637        | 8.4906  | 3600 | 2.6992          | 5.5678 | 21.8649 |
| 2.2013        | 9.4340  | 4000 | 2.6395          | 6.2028 | 22.4425 |
| 2.1493        | 10.3774 | 4400 | 2.6026          | 6.5708 | 22.799  |
| 2.1372        | 11.3208 | 4800 | 2.5663          | 6.6618 | 22.9805 |
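
The card does not state the exact metric code; BLEU and chrF columns like these are commonly computed with the evaluate library's sacrebleu and chrf metrics, as in the hedged sketch below (the prediction and reference strings are placeholders).

```python
# Hedged sketch of a typical BLEU/chrF computation with `evaluate`; the exact
# evaluation code used for this model is not given in the card.
import evaluate

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")

predictions = ["hypothetical model translation"]
references = [["hypothetical reference translation"]]

print("BLEU:", bleu.compute(predictions=predictions, references=references)["score"])
print("chrF:", chrf.compute(predictions=predictions, references=references)["score"])
```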

Framework versions

  • PEFT 0.17.1
  • Transformers 4.56.2
  • PyTorch 2.8.0
  • Datasets 4.1.1
  • Tokenizers 0.22.1