MarianMT English → Atlasic Tamazight (Tachelhit / Central Atlas Tamazight)
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ber that translates from English to Atlasic Tamazight (Tachelhit / Central Atlas Tamazight).
Model Overview
| Property | Description |
|---|---|
| Base Model | Helsinki-NLP/opus-mt-en-ber |
| Architecture | MarianMT |
| Languages | English → Tamazight (Tachelhit / Central Atlas Tamazight) |
| Fine-tuning Dataset | 169K medium-quality synthetic sentence pairs generated by translating English corpora |
| Training Objective | Sequence-to-sequence translation fine-tuning |
| Framework | 🤗 Transformers |
| Tokenizer | SentencePiece |
Training Details
| Hyperparameter | Value |
|---|---|
| per_device_train_batch_size | 16 |
| per_device_eval_batch_size | 48 |
| learning_rate | 2e-5 |
| num_train_epochs | 8 |
| max_length | 128 |
| num_beams | 5 |
| eval_steps | 5000 |
| save_steps | 5000 |
| generation_no_repeat_ngram_size | 3 |
| generation_repetition_penalty | 1.5 |
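The training script itself is not part of this card. As a rough sketch only, the values in the table above could be passed to the Transformers `Seq2SeqTrainingArguments` and `GenerationConfig` classes as shown below. The `output_dir` name is hypothetical, and treating `max_length` as a generation setting is an assumption; it may equally have been the tokenization limit.

```python
# Values copied from the hyperparameter table above.
HPARAMS = {
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 48,
    "learning_rate": 2e-5,
    "num_train_epochs": 8,
    "eval_steps": 5000,
    "save_steps": 5000,
}

# Generation settings used when evaluating with generate().
# Assumption: max_length=128 applies to generated sequences.
GENERATION = {
    "max_length": 128,
    "num_beams": 5,
    "no_repeat_ngram_size": 3,
    "repetition_penalty": 1.5,
}


def build_training_args(output_dir="marian-en-tamazight"):  # hypothetical path
    """Build training arguments from the tables above (sketch, not the author's script)."""
    # Imported lazily so the dictionaries can be inspected without Transformers installed.
    from transformers import GenerationConfig, Seq2SeqTrainingArguments

    # The evaluation-strategy argument is left out on purpose: its name differs
    # between Transformers releases, and the card does not state which was used.
    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # needed to compute BLEU on generated text
        generation_config=GenerationConfig(**GENERATION),
        **HPARAMS,
    )
```

Calling `build_training_args()` requires `transformers` and PyTorch; the result would then be handed to a `Seq2SeqTrainer` together with the model, tokenizer, and datasets.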
Training Environment:
- 1 × NVIDIA P100 (16 GB) on Kaggle
- Total training time: 6 h 34 m
Evaluation Results
| Step | Train Loss | Val Loss | BLEU |
|---|---|---|---|
| 5000 | 0.4258 | 0.4082 | 2.01 |
| 10000 | 0.3694 | 0.3511 | 6.09 |
| 15000 | 0.3419 | 0.3232 | 7.83 |
| 20000 | 0.3148 | 0.3054 | 8.44 |
| 25000 | 0.2965 | 0.2923 | 9.79 |
| 30000 | 0.2895 | 0.2824 | 10.19 |
| 35000 | 0.2755 | 0.2756 | 11.26 |
| 40000 | 0.2733 | 0.2691 | 11.75 |
| 45000 | 0.2623 | 0.2649 | 12.26 |
| 50000 | 0.2581 | 0.2598 | 12.64 |
| 55000 | 0.2490 | 0.2567 | 12.83 |
| 60000 | 0.2520 | 0.2539 | 13.47 |
| 65000 | 0.2428 | 0.2518 | 13.60 |
| 70000 | 0.2376 | 0.2500 | 13.77 |
| 75000 | 0.2376 | 0.2488 | 13.87 |
| 80000 | 0.2362 | 0.2479 | 13.96 |
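As a sanity check, the step counts in the table can be related to the training hyperparameters. The arithmetic below assumes a single GPU with no gradient accumulation and that training ended close to the last logged step (80000); neither is stated explicitly in this card, and 169K is treated as exactly 169,000 pairs.

```python
batch_size = 16          # per_device_train_batch_size, one GPU
epochs = 8               # num_train_epochs
final_step = 80_000      # last evaluation step in the table (assumed near the end of training)
dataset_pairs = 169_000  # "169K" rounded; the exact count is not given

examples_seen = final_step * batch_size    # total training examples processed
pairs_per_epoch = examples_seen // epochs  # implied size of the training split
train_fraction = pairs_per_epoch / dataset_pairs

print(f"{pairs_per_epoch:,} pairs per epoch, {train_fraction:.1%} of the dataset")
# prints "160,000 pairs per epoch, 94.7% of the dataset"
```

Under these assumptions, roughly 160K of the 169K pairs were used for training, which would be consistent with the remainder being held out for validation and testing.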
Practical BLEU Evaluation Results
- Beam size = 5
- No-repeat n-gram size = 3
- Repetition penalty = 1.5
- BLEU score = 17.903
Example Translations
| English | Atlasic Tamazight |
|---|---|
| I will go to school. | Rad ftuɣ s tinml. |
| What did you say? | Mad tnnit? |
| I'm not talking to you, I'm talking to him! | Ur ar gis sawalɣ, ar ak sawalɣ! |
| Everyone has a secret face. | Kraygatt yan ila waḥdut. |
Hugging Face Space:
ilyasaqit/English-Tamazight-Translator
Notes
- The dataset is synthetic, not manually verified.
- The model performs best on short and simple general-domain sentences.
- Recommended decoding parameters:
  - num_beams=5
  - repetition_penalty=1.2 to 1.5
  - no_repeat_ngram_size=3
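A minimal inference sketch using the recommended decoding parameters is shown below. It assumes the checkpoint loads with the standard `MarianMTModel` and `MarianTokenizer` classes under the repository ID from the citation URL, and that plain English input works without a target-language prefix token; the card does not confirm either point. The `translate` helper and the `max_length=128` cap are illustrative choices.

```python
MODEL_ID = "ilyasaqit/opus-mt-en-atlasic_tamazight-synth169k-nmv"

# Recommended decoding parameters from the notes above.
DECODING = {
    "num_beams": 5,
    "no_repeat_ngram_size": 3,
    "repetition_penalty": 1.5,  # upper end of the recommended 1.2 to 1.5 range
    "max_length": 128,          # assumption: same limit as used in training
}


def translate(sentences, model_id=MODEL_ID, **overrides):
    """Translate a list of English sentences (hypothetical helper)."""
    # Imported lazily so DECODING can be inspected without loading the library.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Pad to the longest sentence in the batch and truncate overlong inputs.
    batch = tokenizer(
        sentences,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=128,
    )
    # Keyword overrides take precedence over the recommended defaults.
    output_ids = model.generate(**batch, **{**DECODING, **overrides})
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

For example, `translate(["I will go to school."])` returns a list with one translated string, and `translate(texts, repetition_penalty=1.2)` tries the lower end of the recommended range. Running it requires `transformers`, `sentencepiece`, and PyTorch, plus network access to download the checkpoint.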
Citation
If you use this model, please cite:
@misc{marian-en-tamazight-2025,
title = {MarianMT English → Atlasic Tamazight (Tachelhit / Central Atlas)},
year = {2025},
url = {https://huggingface.co/ilyasaqit/opus-mt-en-atlasic_tamazight-synth169k-nmv}
}