# mc_danish
This model is a fine-tuned version of MediaCatch/xls-r-300m-danish-mc on the Preprocessed Dataset dataset. It achieves the following results on the evaluation set:
- Loss: 0.1570
- Wer: 0.0875
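Here, Wer is the word error rate: the word-level edit distance between the model's transcript and the reference, divided by the number of reference words. As a minimal illustration (pure Python; not the evaluation script actually used for this model, and the Danish example sentence is made up):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (r != h)))    # substitution
        prev = cur
    return prev[-1] / len(ref)

print(wer("jeg hedder anna", "jeg hedder anne"))  # 1 substitution / 3 words ≈ 0.333
```

A Wer of 0.0875 therefore means roughly one word in eleven is wrong relative to the reference transcripts.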
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 3
- eval_batch_size: 6
- seed: 69
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 48
- total_eval_batch_size: 24
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 500
- num_epochs: 10.0
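The total train batch size follows from the per-device batch size, the number of devices, and gradient accumulation, and the learning-rate curve is linear warmup followed by cosine decay. A sketch of that schedule (assuming a decay to zero over roughly the ~50,000 steps logged below; the exact total step count is an assumption):

```python
import math

# Effective train batch size: per-device batch x devices x grad accumulation
assert 3 * 4 * 4 == 48

def lr_at(step, total_steps, base_lr=1e-4, warmup_steps=500):
    """Linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0, 50_000))      # 0.0 (start of warmup)
print(lr_at(500, 50_000))    # 0.0001 (peak, end of warmup)
```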
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| No log | 0 | 0 | 3.1518 | 0.1154 |
| 0.1233 | 0.0991 | 500 | 0.1684 | 0.1039 |
| 0.1122 | 0.1982 | 1000 | 0.1536 | 0.1059 |
| 0.109 | 0.2973 | 1500 | 0.1497 | 0.1032 |
| 0.1027 | 0.3965 | 2000 | 0.1471 | 0.1036 |
| 0.1082 | 0.4956 | 2500 | 0.1446 | 0.1053 |
| 0.105 | 0.5947 | 3000 | 0.1440 | 0.1059 |
| 0.0986 | 0.6938 | 3500 | 0.1436 | 0.1041 |
| 0.1037 | 0.7929 | 4000 | 0.1404 | 0.1015 |
| 0.0988 | 0.8920 | 4500 | 0.1391 | 0.1008 |
| 0.0979 | 0.9911 | 5000 | 0.1373 | 0.0995 |
| 0.0833 | 1.0902 | 5500 | 0.1385 | 0.0991 |
| 0.0855 | 1.1893 | 6000 | 0.1407 | 0.0990 |
| 0.0848 | 1.2884 | 6500 | 0.1390 | 0.0975 |
| 0.0845 | 1.3875 | 7000 | 0.1365 | 0.0986 |
| 0.0824 | 1.4866 | 7500 | 0.1373 | 0.0975 |
| 0.0842 | 1.5858 | 8000 | 0.1353 | 0.0979 |
| 0.0837 | 1.6849 | 8500 | 0.1334 | 0.0968 |
| 0.0823 | 1.7840 | 9000 | 0.1348 | 0.0960 |
| 0.0851 | 1.8831 | 9500 | 0.1331 | 0.0957 |
| 0.0843 | 1.9822 | 10000 | 0.1299 | 0.0949 |
| 0.0707 | 2.0813 | 10500 | 0.1337 | 0.0926 |
| 0.0709 | 2.1804 | 11000 | 0.1332 | 0.0958 |
| 0.0742 | 2.2795 | 11500 | 0.1316 | 0.0944 |
| 0.0742 | 2.3786 | 12000 | 0.1356 | 0.0959 |
| 0.0719 | 2.4777 | 12500 | 0.1323 | 0.0969 |
| 0.0739 | 2.5768 | 13000 | 0.1286 | 0.0951 |
| 0.0695 | 2.6760 | 13500 | 0.1315 | 0.0957 |
| 0.0741 | 2.7751 | 14000 | 0.1310 | 0.0940 |
| 0.0729 | 2.8742 | 14500 | 0.1303 | 0.0970 |
| 0.0695 | 2.9733 | 15000 | 0.1316 | 0.0939 |
| 0.0637 | 3.0724 | 15500 | 0.1353 | 0.0955 |
| 0.0664 | 3.1715 | 16000 | 0.1333 | 0.0940 |
| 0.0635 | 3.2706 | 16500 | 0.1370 | 0.0941 |
| 0.0652 | 3.3697 | 17000 | 0.1334 | 0.0937 |
| 0.0653 | 3.4688 | 17500 | 0.1320 | 0.0957 |
| 0.0654 | 3.5679 | 18000 | 0.1365 | 0.0938 |
| 0.0633 | 3.6670 | 18500 | 0.1363 | 0.0943 |
| 0.0642 | 3.7661 | 19000 | 0.1316 | 0.0926 |
| 0.0622 | 3.8653 | 19500 | 0.1282 | 0.0906 |
| 0.0653 | 3.9644 | 20000 | 0.1334 | 0.0904 |
| 0.0585 | 4.0634 | 20500 | 0.1363 | 0.0914 |
| 0.057 | 4.1625 | 21000 | 0.1334 | 0.0935 |
| 0.0591 | 4.2617 | 21500 | 0.1370 | 0.0914 |
| 0.0538 | 4.3608 | 22000 | 0.1357 | 0.0929 |
| 0.0586 | 4.4599 | 22500 | 0.1379 | 0.0916 |
| 0.0556 | 4.5590 | 23000 | 0.1378 | 0.0925 |
| 0.0574 | 4.6581 | 23500 | 0.1353 | 0.0898 |
| 0.0545 | 4.7572 | 24000 | 0.1371 | 0.0912 |
| 0.0572 | 4.8563 | 24500 | 0.1320 | 0.0895 |
| 0.0546 | 4.9554 | 25000 | 0.1361 | 0.0908 |
| 0.0485 | 5.0545 | 25500 | 0.1429 | 0.0926 |
| 0.054 | 5.1536 | 26000 | 0.1401 | 0.0912 |
| 0.0507 | 5.2527 | 26500 | 0.1406 | 0.0888 |
| 0.0519 | 5.3519 | 27000 | 0.1416 | 0.0902 |
| 0.0524 | 5.4510 | 27500 | 0.1403 | 0.0903 |
| 0.05 | 5.5501 | 28000 | 0.1395 | 0.0890 |
| 0.0503 | 5.6492 | 28500 | 0.1439 | 0.0892 |
| 0.0528 | 5.7483 | 29000 | 0.1402 | 0.0905 |
| 0.0503 | 5.8474 | 29500 | 0.1424 | 0.0902 |
| 0.051 | 5.9465 | 30000 | 0.1412 | 0.0890 |
| 0.0471 | 6.0456 | 30500 | 0.1447 | 0.0893 |
| 0.0461 | 6.1447 | 31000 | 0.1511 | 0.0885 |
| 0.0436 | 6.2438 | 31500 | 0.1505 | 0.0898 |
| 0.0483 | 6.3429 | 32000 | 0.1458 | 0.0884 |
| 0.0457 | 6.4420 | 32500 | 0.1449 | 0.0886 |
| 0.0465 | 6.5412 | 33000 | 0.1430 | 0.0880 |
| 0.0449 | 6.6403 | 33500 | 0.1487 | 0.0892 |
| 0.0455 | 6.7394 | 34000 | 0.1491 | 0.0883 |
| 0.0483 | 6.8385 | 34500 | 0.1476 | 0.0884 |
| 0.0485 | 6.9376 | 35000 | 0.1449 | 0.0885 |
| 0.0445 | 7.0367 | 35500 | 0.1504 | 0.0878 |
| 0.0429 | 7.1358 | 36000 | 0.1544 | 0.0887 |
| 0.0429 | 7.2349 | 36500 | 0.1507 | 0.0885 |
| 0.0449 | 7.3340 | 37000 | 0.1499 | 0.0890 |
| 0.0414 | 7.4331 | 37500 | 0.1522 | 0.0878 |
| 0.0414 | 7.5322 | 38000 | 0.1519 | 0.0888 |
| 0.0405 | 7.6313 | 38500 | 0.1540 | 0.0878 |
| 0.0424 | 7.7305 | 39000 | 0.1535 | 0.0884 |
| 0.0421 | 7.8296 | 39500 | 0.1533 | 0.0883 |
| 0.0418 | 7.9287 | 40000 | 0.1540 | 0.0884 |
| 0.0404 | 8.0278 | 40500 | 0.1537 | 0.0880 |
| 0.0412 | 8.1269 | 41000 | 0.1570 | 0.0875 |
| 0.0408 | 8.2260 | 41500 | 0.1569 | 0.0880 |
| 0.0408 | 8.3251 | 42000 | 0.1567 | 0.0878 |
| 0.039 | 8.4242 | 42500 | 0.1570 | 0.0881 |
| 0.0392 | 8.5233 | 43000 | 0.1559 | 0.0881 |
| 0.0424 | 8.6224 | 43500 | 0.1555 | 0.0887 |
| 0.0394 | 8.7215 | 44000 | 0.1572 | 0.0883 |
| 0.039 | 8.8207 | 44500 | 0.1581 | 0.0886 |
| 0.0398 | 8.9198 | 45000 | 0.1561 | 0.0880 |
| 0.0401 | 9.0188 | 45500 | 0.1565 | 0.0884 |
| 0.0393 | 9.1179 | 46000 | 0.1578 | 0.0881 |
| 0.04 | 9.2171 | 46500 | 0.1581 | 0.0878 |
| 0.0407 | 9.3162 | 47000 | 0.1579 | 0.0880 |
| 0.039 | 9.4153 | 47500 | 0.1582 | 0.0882 |
| 0.0388 | 9.5144 | 48000 | 0.1580 | 0.0883 |
| 0.0411 | 9.6135 | 48500 | 0.1580 | 0.0881 |
| 0.0403 | 9.7126 | 49000 | 0.1578 | 0.0880 |
| 0.0398 | 9.8117 | 49500 | 0.1578 | 0.0880 |
| 0.0377 | 9.9108 | 50000 | 0.1577 | 0.0880 |
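The reported evaluation result (Loss 0.1570, Wer 0.0875) corresponds to the checkpoint with the lowest validation WER, at step 41000. Selecting a best checkpoint from such a log can be sketched as (a few rows copied from the table above):

```python
# (step, val_loss, wer) rows from the training log above
rows = [
    (40000, 0.1540, 0.0884),
    (40500, 0.1537, 0.0880),
    (41000, 0.1570, 0.0875),
    (41500, 0.1569, 0.0880),
]

# Pick the checkpoint with the lowest WER (not the lowest loss)
best = min(rows, key=lambda r: r[2])
print(best)  # (41000, 0.157, 0.0875), matching the reported eval results
```

Note that validation loss and WER disagree here: the lowest loss (0.1286, step 13000) is not the lowest WER, so selecting on loss would pick a different checkpoint.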
### Framework versions
- Transformers 4.56.2
- Pytorch 2.8.0+cu128
- Datasets 4.1.1
- Tokenizers 0.22.1
## Model tree for MediaCatch/xls-r-300m-danish-mc-v2

- Base model: MediaCatch/xls-r-300m-danish-mc