πŸ”οΈ MarianMT English β†’ Atlasic Tamazight (Tachelhit / Central Atlas Tamazight)

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ber that translates from English β†’ Atlasic Tamazight (Tachelhit/Central Atlas Tamazight).


## πŸ“˜ Model Overview

| Property | Description |
|---|---|
| Base Model | Helsinki-NLP/opus-mt-en-ber |
| Architecture | MarianMT |
| Languages | English β†’ Tamazight (Tachelhit / Central Atlas Tamazight) |
| Fine-tuning Dataset | 169K medium-quality synthetic sentence pairs generated by translating English corpora |
| Training Objective | Sequence-to-sequence translation fine-tuning |
| Framework | πŸ€— Transformers |
| Tokenizer | SentencePiece |
| Model Size | 62.6M parameters (F32, Safetensors) |
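A minimal usage sketch with the πŸ€— Transformers `pipeline` API (the model id below is this repository; the input sentence is taken from the example translations further down):

```python
# Minimal usage sketch with the Transformers translation pipeline.
# Requires: pip install transformers sentencepiece torch
from transformers import pipeline

translator = pipeline(
    "translation",
    model="ilyasaqit/opus-mt-en-atlasic_tamazight-synth169k-nmv",
)

result = translator("I will go to school.")
print(result[0]["translation_text"])
```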

## 🧠 Training Details

| Hyperparameter | Value |
|---|---|
| `per_device_train_batch_size` | 16 |
| `per_device_eval_batch_size` | 48 |
| `learning_rate` | 2e-5 |
| `num_train_epochs` | 8 |
| `max_length` | 128 |
| `num_beams` | 5 |
| `eval_steps` | 5000 |
| `save_steps` | 5000 |
| `generation_no_repeat_ngram_size` | 3 |
| `generation_repetition_penalty` | 1.5 |
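The hyperparameters above map roughly onto a `Seq2SeqTrainingArguments` configuration as follows. This is a hedged sketch, not the card author's actual script: `output_dir`, the evaluation strategy, and `predict_with_generate` are assumptions, and the two generation-time penalties from the table would be set on the model's `GenerationConfig` rather than here.

```python
# Sketch of a training configuration mirroring the table above.
# output_dir, eval_strategy, and predict_with_generate are assumed values.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="marian-en-tamazight",  # assumed, not stated in the card
    per_device_train_batch_size=16,
    per_device_eval_batch_size=48,
    learning_rate=2e-5,
    num_train_epochs=8,
    eval_strategy="steps",             # assumed, since eval_steps is set
    eval_steps=5000,
    save_steps=5000,
    predict_with_generate=True,        # assumed, needed for BLEU during eval
    generation_max_length=128,
    generation_num_beams=5,
)
# generation_no_repeat_ngram_size / generation_repetition_penalty from the
# table belong on the model's GenerationConfig (not shown in this sketch).
```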

**Training Environment:**
- 1 Γ— NVIDIA P100 (16 GB) on Kaggle
- Total training time: 6 h 34 m

## πŸ“ˆ Evaluation Results

| Step | Train Loss | Val Loss | BLEU |
|---|---|---|---|
| 5000 | 0.4258 | 0.4082 | 2.01 |
| 10000 | 0.3694 | 0.3511 | 6.09 |
| 15000 | 0.3419 | 0.3232 | 7.83 |
| 20000 | 0.3148 | 0.3054 | 8.44 |
| 25000 | 0.2965 | 0.2923 | 9.79 |
| 30000 | 0.2895 | 0.2824 | 10.19 |
| 35000 | 0.2755 | 0.2756 | 11.26 |
| 40000 | 0.2733 | 0.2691 | 11.75 |
| 45000 | 0.2623 | 0.2649 | 12.26 |
| 50000 | 0.2581 | 0.2598 | 12.64 |
| 55000 | 0.2490 | 0.2567 | 12.83 |
| 60000 | 0.2520 | 0.2539 | 13.47 |
| 65000 | 0.2428 | 0.2518 | 13.60 |
| 70000 | 0.2376 | 0.2500 | 13.77 |
| 75000 | 0.2376 | 0.2488 | 13.87 |
| 80000 | 0.2362 | 0.2479 | 13.96 |

## 🌍 Practical BLEU Evaluation Results

- Beam size = 5
- No-repeat n-gram size = 3
- Repetition penalty = 1.5
- **BLEU score = 17.903**


## πŸ’¬ Example Translations

| English | Atlasic Tamazight |
|---|---|
| I will go to school. | Rad ftuΙ£ s tinml. |
| What did you say? | Mad tnnit? |
| I'm not talking to you, I'm talking to him! | Ur ar gis sawalΙ£, ar ak sawalΙ£! |
| Everyone has a secret face. | Kraygatt yan ila waαΈ₯dut. |

Hugging Face Space:
πŸ‘‰ ilyasaqit/English-Tamazight-Translator


## πŸͺΆ Notes

- The dataset is synthetic and has not been manually verified.
- The model performs best on short, simple, general-domain sentences.
- Recommended decoding parameters:
  - `num_beams=5`
  - `repetition_penalty=1.2–1.5`
  - `no_repeat_ngram_size=3`

## πŸ“š Citation

If you use this model, please cite:

```bibtex
@misc{marian-en-tamazight-2025,
  title  = {MarianMT English β†’ Atlasic Tamazight (Tachelhit / Central Atlas)},
  year   = {2025},
  url    = {https://huggingface.co/ilyasaqit/opus-mt-en-atlasic_tamazight-synth169k-nmv}
}
```