DANN_CL_selfsupervised_JW

This model is a fine-tuned version of bustamiyusoef/NougatArabic_JawiAugment on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6904

Model description

More information needed

Intended uses & limitations

More information needed
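
Pending a fuller description, the base model's name suggests a Nougat-style vision encoder-decoder for transcribing Jawi (Arabic-script) documents. A minimal inference sketch, assuming the checkpoint and processor follow the standard Nougat layout in transformers (the input file name is hypothetical):

```python
from PIL import Image
from transformers import NougatProcessor, VisionEncoderDecoderModel

# Assumption: the repository ships a Nougat-compatible processor
# alongside the vision encoder-decoder weights.
model_id = "bustamiyusoef/DANN_CL_selfsupervised_JW"
processor = NougatProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

image = Image.open("page.png").convert("RGB")  # hypothetical input scan
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Generate the transcription and decode it to text.
outputs = model.generate(pixel_values, max_new_tokens=512)
text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
print(text)
```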

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 10
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 60
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
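
These settings can be expressed as a transformers Seq2SeqTrainingArguments object. A minimal sketch, not the authors' actual training script; the output path is hypothetical and the dataset/trainer wiring is omitted:

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters from the list above mapped onto TrainingArguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="./DANN_CL_selfsupervised_JW",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=6,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```

Note that per_device_train_batch_size=10 combined with gradient_accumulation_steps=6 yields the effective total train batch size of 60 reported above.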

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 5.8055        | 0.9917  | 80   | 0.9818          |
| 8.3174        | 1.9917  | 160  | 1.3511          |
| 5.4415        | 2.9917  | 240  | 0.9034          |
| 4.7409        | 3.9917  | 320  | 0.8253          |
| 4.521         | 4.9917  | 400  | 0.8176          |
| 4.1487        | 5.9917  | 480  | 0.7606          |
| 4.2873        | 6.9917  | 560  | 0.7819          |
| 4.1353        | 7.9917  | 640  | 0.7626          |
| 4.0225        | 8.9917  | 720  | 0.7288          |
| 3.9666        | 9.9917  | 800  | 0.7188          |
| 3.8694        | 10.9917 | 880  | 0.7080          |
| 3.8766        | 11.9917 | 960  | 0.7197          |
| 3.8285        | 12.9917 | 1040 | 0.7055          |
| 3.8199        | 13.9917 | 1120 | 0.7091          |
| 3.8439        | 14.9917 | 1200 | 0.7176          |
| 3.8258        | 15.9917 | 1280 | 0.7041          |
| 3.7933        | 16.9917 | 1360 | 0.6932          |
| 3.7643        | 17.9917 | 1440 | 0.6912          |
| 3.7467        | 18.9917 | 1520 | 0.6917          |
| 3.7627        | 19.9917 | 1600 | 0.6904          |
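
For a quick view of the trend, the validation losses in the table can be plotted directly; a minimal matplotlib sketch using the values above:

```python
import matplotlib.pyplot as plt

# Validation loss per epoch, copied from the training results table.
epochs = list(range(1, 21))
val_loss = [0.9818, 1.3511, 0.9034, 0.8253, 0.8176, 0.7606, 0.7819,
            0.7626, 0.7288, 0.7188, 0.7080, 0.7197, 0.7055, 0.7091,
            0.7176, 0.7041, 0.6932, 0.6912, 0.6917, 0.6904]

plt.plot(epochs, val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("DANN_CL_selfsupervised_JW validation loss")
plt.show()
```

Validation loss improves steadily after the epoch-2 spike and is lowest (0.6904) at the final checkpoint.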

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 4.1.1
  • Tokenizers 0.21.0