---
language:
- en
- rto
- lug
- ach
- nyn
license: mit
tags:
- translation
- african-languages
- rutooro
- luganda
- acholi
- runyankore
datasets:
- custom
metrics:
- bleu
library_name: transformers
pipeline_tag: translation
base_model: Helsinki-NLP/opus-mt-en-mul
widget:
- text: ">>rutooro<< Education is important for community development."
- text: ">>luganda<< Mobile phones have transformed communication in rural areas."
- text: ">>acholi<< The market opens early in the morning."
- text: ">>runyankore<< Women play a crucial role in community development."
---

# Rutooro-Centric Multilingual Translation Model

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-mul](https://huggingface.co/Helsinki-NLP/opus-mt-en-mul) that specializes in translating from English to Rutooro and other East African languages.

## Model Description

This translation model focuses on Rutooro while also supporting other East African languages: Luganda, Acholi, and Runyankore. It was fine-tuned on a curated dataset containing thousands of translation pairs across multiple languages, with special emphasis on rows where Rutooro translations were present.

## Supported Languages

The model primarily supports translation from English to:

- **Rutooro** (Ugandan language spoken by the Batooro people)
- **Luganda** (Most widely spoken Ugandan language)
- **Acholi** (Nilotic language spoken in Northern Uganda and South Sudan)
- **Runyankore** (Language spoken in southwestern Uganda)

Other languages from the base model may also work, but with varying quality.

## Usage

To use this model for translation:

```python
from transformers import pipeline

# Initialize the translation pipeline
translator = pipeline("translation", model="MubarakB/rutooro-multilingual-translator")

# Translate to Rutooro
text = "Education is important for community development."
rutooro_translation = translator(f">>rutooro<< {text}")
print(f"Rutooro: {rutooro_translation[0]['translation_text']}")

# Translate to other supported languages
luganda_translation = translator(f">>luganda<< {text}")
print(f"Luganda: {luganda_translation[0]['translation_text']}")

acholi_translation = translator(f">>acholi<< {text}")
print(f"Acholi: {acholi_translation[0]['translation_text']}")

runyankore_translation = translator(f">>runyankore<< {text}")
print(f"Runyankore: {runyankore_translation[0]['translation_text']}")
```

### Language Tokens

When using this model, you must prefix your input text with the appropriate language token:

- `>>rutooro<<` - For Rutooro translation
- `>>luganda<<` - For Luganda translation
- `>>acholi<<` - For Acholi translation
- `>>runyankore<<` - For Runyankore translation

## Example Translations

| English | Rutooro | Luganda | Acholi | Runyankore |
|---------|---------|---------|--------|------------|
| Education is important for development. | Okusoma nikwomuhendo ahabw'okukulaakulana. | Okusoma kikulu nnyo mu nkulaakulana. | Kwan dongo pire me yubo lobo. | Okushoma nikukuru ahabw'okukulaakulana. |
| Mobile phones have transformed communication in rural areas. | Esimu zabyemikono zihindwireho enkoragana omubicweka byakyaro. | Essimu ezitambulizibwa mu ngalo zikyusizza eby'empuliziganya mu byalo. | Simu latic me cing ocele kit me kwat lok i gang me tung. | Amasimu g'ebyemikono gakyusizza empuliziganya mu byalo. |
| The market opens early in the morning. | Akatale kagurwaho kare omumakya. | Akatale kabbika mu makya. | Gang cuk yabedo labongo ikare me ice. | Akatale kakingirweho makya. |
| Women play a crucial role in community development. | Abakazzi nibakora mulimo gwa mughaso ngu kukulakulanya ekyaro. | Abakazi balina ekifo ekikulu mu nkulaakulana y'eggwanga. | Mon ni tii tic ma kwako alokaloka me kom kin gang. | Abakazi bakola omulimu murungi mu nkulaakulana y'ekitundu. |

## Model Details

- **Base Model:** Helsinki-NLP/opus-mt-en-mul
- **Model Type:** Sequence-to-Sequence (Encoder-Decoder Transformer)
- **Training Data:** Multilingual dataset with a focus on Rutooro translations
- **Fine-tuning:** Targeted fine-tuning with special emphasis on Rutooro language pairs
- **Language Coverage:**
  - Rutooro (11.75% of dataset)
  - Luganda (99.86% of dataset)
  - Acholi (99.87% of dataset)
  - Runyankore (99.87% of dataset)

## Limitations

- The model is optimized for general conversational text and may not perform as well on highly specialized or technical content
- Performance may vary with language coverage in the training data; Rutooro, for example, appears in only 11.75% of the dataset
- Quality can vary with sentence complexity and domain
- Some languages may benefit from additional fine-tuning with more domain-specific data

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{rutooro-multilingual-translator,
  author = {Mubarak Bachu},
  title = {Rutooro-Centric Multilingual Translation Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/MubarakB/rutooro-multilingual-translator}}
}
```

## Acknowledgments

This model builds upon the excellent work by Helsinki-NLP and the Opus-MT project. Special thanks to the communities supporting the preservation and computational processing of East African languages.
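
As a supplement to the Language Tokens section, the prefixing convention can be wrapped in a small helper so that a missing or misspelled token is caught before the text reaches the model. This is a minimal sketch, assuming the four tokens listed in this card are the only ones worth validating against; `SUPPORTED_LANGUAGES`, `with_language_token`, and `build_batch` are hypothetical names for illustration and are not part of the model or of Transformers.

```python
# Hypothetical helper names; only the >>language<< token format comes from this card.
SUPPORTED_LANGUAGES = ("rutooro", "luganda", "acholi", "runyankore")


def with_language_token(text: str, language: str) -> str:
    """Prefix `text` with the >>language<< token the model expects."""
    key = language.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported target language {language!r}; "
            f"expected one of {', '.join(SUPPORTED_LANGUAGES)}"
        )
    return f">>{key}<< {text.strip()}"


def build_batch(text: str, languages=SUPPORTED_LANGUAGES) -> list[str]:
    """Build one prefixed input per target language, in the order given."""
    return [with_language_token(text, lang) for lang in languages]


if __name__ == "__main__":
    # Print the prefixed inputs without loading the model.
    for line in build_batch("The market opens early in the morning."):
        print(line)
```

The resulting strings can be passed to the `translator` pipeline from the Usage section, for example `translator(build_batch(text))`, since Transformers pipelines accept a list of inputs.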