DANN_CL_selfsupervised_JW

This model is a fine-tuned version of bustamiyusoef/NougatArabic_JawiAugment on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6904

Model description

More information needed

Intended uses & limitations

More information needed
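
Pending a fuller description, the base model's name suggests a Nougat-style vision encoder-decoder for transcribing Jawi (Arabic-script) documents. A minimal inference sketch, assuming the checkpoint and processor follow the standard Nougat layout in transformers (the input file name is hypothetical):

```python
from PIL import Image
from transformers import NougatProcessor, VisionEncoderDecoderModel

# Assumption: the repository ships a Nougat-compatible processor
# alongside the vision encoder-decoder weights.
model_id = "bustamiyusoef/DANN_CL_selfsupervised_JW"
processor = NougatProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

image = Image.open("page.png").convert("RGB")  # hypothetical input scan
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Generate the transcription and decode it to text.
outputs = model.generate(pixel_values, max_new_tokens=512)
text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
print(text)
```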

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 10
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 60
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
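
These settings can be expressed as a transformers Seq2SeqTrainingArguments object. A minimal sketch, not the authors' actual training script; the output path is hypothetical and the dataset/trainer wiring is omitted:

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters from the list above mapped onto TrainingArguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="./DANN_CL_selfsupervised_JW",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=6,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```

Note that per_device_train_batch_size=10 combined with gradient_accumulation_steps=6 yields the effective total train batch size of 60 reported above.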

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 5.8055        | 0.9917  | 80   | 0.9818          |
| 8.3174        | 1.9917  | 160  | 1.3511          |
| 5.4415        | 2.9917  | 240  | 0.9034          |
| 4.7409        | 3.9917  | 320  | 0.8253          |
| 4.521         | 4.9917  | 400  | 0.8176          |
| 4.1487        | 5.9917  | 480  | 0.7606          |
| 4.2873        | 6.9917  | 560  | 0.7819          |
| 4.1353        | 7.9917  | 640  | 0.7626          |
| 4.0225        | 8.9917  | 720  | 0.7288          |
| 3.9666        | 9.9917  | 800  | 0.7188          |
| 3.8694        | 10.9917 | 880  | 0.7080          |
| 3.8766        | 11.9917 | 960  | 0.7197          |
| 3.8285        | 12.9917 | 1040 | 0.7055          |
| 3.8199        | 13.9917 | 1120 | 0.7091          |
| 3.8439        | 14.9917 | 1200 | 0.7176          |
| 3.8258        | 15.9917 | 1280 | 0.7041          |
| 3.7933        | 16.9917 | 1360 | 0.6932          |
| 3.7643        | 17.9917 | 1440 | 0.6912          |
| 3.7467        | 18.9917 | 1520 | 0.6917          |
| 3.7627        | 19.9917 | 1600 | 0.6904          |
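
For a quick view of the trend, the validation losses in the table can be plotted directly; a minimal matplotlib sketch using the values above:

```python
import matplotlib.pyplot as plt

# Validation loss per epoch, copied from the training results table.
epochs = list(range(1, 21))
val_loss = [0.9818, 1.3511, 0.9034, 0.8253, 0.8176, 0.7606, 0.7819,
            0.7626, 0.7288, 0.7188, 0.7080, 0.7197, 0.7055, 0.7091,
            0.7176, 0.7041, 0.6932, 0.6912, 0.6917, 0.6904]

plt.plot(epochs, val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("DANN_CL_selfsupervised_JW validation loss")
plt.show()
```

Validation loss improves steadily after the epoch-2 spike and is lowest (0.6904) at the final checkpoint.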

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 4.1.1
  • Tokenizers 0.21.0