nllb-600M-medical-luganda-bidirectional

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5663
  • BLEU: 6.6618
  • chrF: 22.9805
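
The checkpoint is a PEFT adapter on top of facebook/nllb-200-distilled-600M. Below is a minimal usage sketch, assuming the adapter is hosted at KMayanja/nllb-600M-medical-luganda-bidirectional and that the standard NLLB language codes eng_Latn and lug_Latn are used for the English/Luganda directions; the example sentence is a placeholder.

```python
# Hedged usage sketch: repo id, language codes, and the example sentence are
# assumptions; adapt them to how the adapter was actually trained.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_id = "facebook/nllb-200-distilled-600M"
adapter_id = "KMayanja/nllb-600M-medical-luganda-bidirectional"

tokenizer = AutoTokenizer.from_pretrained(base_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the PEFT adapter

text = "Take one tablet twice a day after meals."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("lug_Latn"),  # decode into Luganda
    max_new_tokens=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

For the reverse direction, load the tokenizer with src_lang="lug_Latn" and force eng_Latn as the target language token instead.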

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 12
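
The listed values map onto a Seq2SeqTrainingArguments configuration roughly as sketched below; the output directory and any options not listed above (evaluation/save strategy, precision, the LoRA/PEFT config itself) are assumptions, not taken from this card.

```python
# Sketch of the listed hyperparameters as Seq2SeqTrainingArguments; output_dir
# and anything not in the list above is a placeholder assumption.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-600M-medical-luganda-bidirectional",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,   # effective train batch size: 4 * 8 = 32
    num_train_epochs=12,
    lr_scheduler_type="linear",
    warmup_steps=100,
    optim="adamw_torch_fused",       # AdamW with default betas=(0.9, 0.999), eps=1e-08
    seed=42,
    predict_with_generate=True,      # needed to compute BLEU/chrF during evaluation
)
```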

Training results

| Training Loss | Epoch   | Step | Validation Loss | BLEU   | chrF    |
|---------------|---------|------|-----------------|--------|---------|
| 5.65          | 0.9434  | 400  | 5.0490          | 1.802  | 15.6619 |
| 4.5322        | 1.8868  | 800  | 4.4031          | 1.3741 | 15.75   |
| 3.8473        | 2.8302  | 1200 | 3.8525          | 1.2316 | 16.2379 |
| 3.3184        | 3.7736  | 1600 | 3.4441          | 1.2876 | 17.0497 |
| 3.0563        | 4.7170  | 2000 | 3.2368          | 1.5506 | 17.5027 |
| 2.7198        | 5.6604  | 2400 | 3.0402          | 2.255  | 18.2851 |
| 2.5405        | 6.6038  | 2800 | 2.9127          | 3.5638 | 19.6118 |
| 2.4426        | 7.5472  | 3200 | 2.8274          | 4.2756 | 20.7453 |
| 2.2637        | 8.4906  | 3600 | 2.6992          | 5.5678 | 21.8649 |
| 2.2013        | 9.4340  | 4000 | 2.6395          | 6.2028 | 22.4425 |
| 2.1493        | 10.3774 | 4400 | 2.6026          | 6.5708 | 22.799  |
| 2.1372        | 11.3208 | 4800 | 2.5663          | 6.6618 | 22.9805 |
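
The card does not state the exact metric code; BLEU and chrF columns like these are commonly computed with the evaluate library's sacrebleu and chrf metrics, as in the hedged sketch below (the prediction and reference strings are placeholders).

```python
# Hedged sketch of a typical BLEU/chrF computation with `evaluate`; the exact
# evaluation code used for this model is not given in the card.
import evaluate

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")

predictions = ["hypothetical model translation"]
references = [["hypothetical reference translation"]]

print("BLEU:", bleu.compute(predictions=predictions, references=references)["score"])
print("chrF:", chrf.compute(predictions=predictions, references=references)["score"])
```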

Framework versions

  • PEFT 0.17.1
  • Transformers 4.56.2
  • PyTorch 2.8.0
  • Datasets 4.1.1
  • Tokenizers 0.22.1