Model Card for llama3.2-3b-Darija-Morocco-QA

This model is a fine-tuned version of Llama 3.2 3B Instruct, optimized for answering questions in Darija (Moroccan Arabic). It was fine-tuned on the Moroccan Wikipedia QA dataset.

Model Details

Model Description

This model is designed to give accurate, contextually relevant answers to questions posed in Darija, the dialect of Arabic spoken in Morocco. Fine-tuning on the Moroccan Wikipedia QA dataset adapts it to this linguistic and cultural context.

  • Developed by: Achraf Abbaoui
  • Model type: Causal Language Model
  • Language(s) (NLP): Darija (Moroccan Arabic)
  • License: MIT License
  • Finetuned from model: meta-llama/Llama-3.2-3B-Instruct

Model Sources

Uses

Direct Use

This model can be used directly for generating answers to questions in Darija. It is particularly useful for applications that require understanding and generating text in Moroccan Arabic, such as chatbots, virtual assistants, and educational tools.

Downstream Use

The model can be fine-tuned further for specific tasks or integrated into larger applications that require natural language processing capabilities in Darija.

Out-of-Scope Use

This model is not intended for use in high-stakes decision-making scenarios or for generating offensive or harmful content. It should not be used for tasks that require understanding of languages other than Darija.

Bias, Risks, and Limitations

The model may exhibit biases present in the training data, which could lead to unfair or inaccurate responses. It is important to evaluate the model's outputs carefully and consider the context in which it is used.

Recommendations

Users should be aware of the potential biases and limitations of the model. It is recommended to use the model in conjunction with human oversight and to regularly evaluate its performance.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config, 
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, add_bos_token=True, trust_remote_code=True)

# Load the QLoRA adapter from the best-performing checkpoint.

from peft import PeftModel

ft_model = PeftModel.from_pretrained(base_model, "AchrafABBAOUI/llama3.2-3b-Darija-Morocco-QA")

# Run inference.

# The prompt has three parts: question (سؤال), context (سياق), and an open answer slot (جواب).
eval_prompt = (
    '### سؤال:\nشحال من دوار كاين ف مشيخة أيت عبد الله لي فيها أزكور؟\n\n'
    '### سياق:\nأزكور هوّ دوار مجمع كاين ف جماعة أيت عبد الله، دائرة إغرم، إقليم تارودانت، جهة سوس ماسة ف لمغريب. هاد دّوار كينتامي ل مشيخة أيت عبد الله لي كتضم 15 د دّواور\n\n'
    '### جواب:\n'
)
# Move the inputs to the device the model was placed on by device_map="auto".
model_input = tokenizer(eval_prompt, return_tensors="pt").to(ft_model.device)

ft_model.eval()
with torch.no_grad():
    print(tokenizer.decode(ft_model.generate(**model_input, max_new_tokens=300)[0], skip_special_tokens=True))

If you are asked to log in to your Hugging Face account because access to "meta-llama/Llama-3.2-3B-Instruct" is restricted, run the following code before loading the model and paste your HF token when prompted:

from huggingface_hub import interpreter_login

interpreter_login()
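
The prompt in the example above follows a three-part template: a question header, a context header, and an answer header left open for the model to complete. The sketch below wraps that template in two small helpers. The names `build_prompt` and `extract_answer` are hypothetical and not part of the released code, and `extract_answer` assumes the decoded output echoes the prompt before the generated answer.

```python
QUESTION_HEADER = "### سؤال:"  # "Question"
CONTEXT_HEADER = "### سياق:"  # "Context"
ANSWER_HEADER = "### جواب:"  # "Answer"


def build_prompt(question: str, context: str) -> str:
    """Format a question and its context in the template used by the example prompt."""
    return (
        f"{QUESTION_HEADER}\n{question.strip()}\n\n"
        f"{CONTEXT_HEADER}\n{context.strip()}\n\n"
        f"{ANSWER_HEADER}\n"
    )


def extract_answer(decoded: str) -> str:
    """Return the text after the first answer header, or the whole text if it is absent."""
    _, found, tail = decoded.partition(ANSWER_HEADER)
    return tail.strip() if found else decoded.strip()
```

With these helpers, `build_prompt(question, context)` could replace the hand-written `eval_prompt`, and `extract_answer` could be applied to the string returned by `tokenizer.decode`.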

Training Details

Training Data

The model was fine-tuned using the Moroccan Wikipedia QA dataset, which contains questions and answers in Darija.

Training Procedure

Preprocessing

The dataset was preprocessed to ensure consistent formatting and tokenization. The tokenizer was configured to pad on the left and add EOS and BOS tokens.
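
To illustrate what left padding with BOS and EOS tokens means for a batch, here is a toy sketch in plain Python. The token ids (`BOS_ID`, `EOS_ID`, `PAD_ID`) and the helper `pad_left` are hypothetical and chosen for readability. The real ids come from the Llama tokenizer, and the actual preprocessing code is not shown in this card.

```python
BOS_ID, EOS_ID, PAD_ID = 1, 2, 0  # hypothetical ids, for illustration only


def pad_left(sequences: list[list[int]]) -> tuple[list[list[int]], list[list[int]]]:
    """Wrap each sequence in BOS/EOS, then left-pad to the longest length.

    Returns (input_ids, attention_mask); the mask is 0 on padding and 1 elsewhere.
    """
    wrapped = [[BOS_ID, *seq, EOS_ID] for seq in sequences]
    width = max(len(seq) for seq in wrapped)
    input_ids = [[PAD_ID] * (width - len(seq)) + seq for seq in wrapped]
    attention_mask = [[0] * (width - len(seq)) + [1] * len(seq) for seq in wrapped]
    return input_ids, attention_mask


ids, mask = pad_left([[11, 12, 13], [21]])
print(ids)   # [[1, 11, 12, 13, 2], [0, 0, 1, 21, 2]]
print(mask)  # [[1, 1, 1, 1, 1], [0, 0, 1, 1, 1]]
```

Padding on the left keeps the real tokens next to the position where generation continues, which is the usual choice for decoder-only models.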

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Learning rate: 2.5e-5
  • Batch size: 64
  • Max steps: 500
  • Optimizer: paged_adamw_8bit
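
As a rough sketch, the values above could be collected into keyword arguments for `transformers.TrainingArguments`. The argument names used here are existing `TrainingArguments` parameters. The card does not state how the batch size of 64 was divided between per-device batch size and gradient accumulation, so the 8 × 8 split below is an assumption.

```python
# Hyperparameters listed above, expressed as TrainingArguments keyword arguments.
training_kwargs = {
    "learning_rate": 2.5e-5,
    "max_steps": 500,
    "optim": "paged_adamw_8bit",
    "bf16": True,
    # Assumption: the card only gives an overall batch size of 64.
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 8,
}

# On a single GPU, the effective batch is per-device batch x accumulation steps.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
examples_seen = effective_batch_size * training_kwargs["max_steps"]
print(effective_batch_size, examples_seen)  # 64 32000

# With transformers installed, these could be passed on as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="out", **training_kwargs)  # "out" is a placeholder
```

At 500 steps with a batch of 64, training processes 32,000 examples in total, counting repeats if the dataset is smaller than that.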

Speeds, Sizes, Times

The model was trained on a single GPU for approximately 2 hours.

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated using a held-out subset of the Moroccan Wikipedia QA dataset.
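
The card does not say how the held-out subset was chosen. One common way to carve out such a subset is a seeded shuffle followed by a split. The sketch below assumes that approach; the function name `split_held_out`, the 10% held-out fraction, and the seed are all hypothetical.

```python
import random


def split_held_out(examples: list, test_fraction: float = 0.1, seed: int = 42):
    """Shuffle deterministically, then return (train, held_out) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, held_out = split_held_out(list(range(100)))
print(len(train), len(held_out))  # 90 10
```

Fixing the seed makes the split reproducible, so the same examples stay out of training every time the split is recomputed.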

Summary

The model performed well on the evaluation dataset, demonstrating its ability to generate accurate and contextually relevant answers to questions in Darija.

Model Examination

The model's interpretability was examined using various techniques, including attention visualization and input perturbation.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA GPU

Technical Specifications

Model Architecture and Objective

The model is based on the Llama 3.2 3B architecture and was fine-tuned as a QLoRA adapter using the PEFT library.

Compute Infrastructure

Hardware

  • NVIDIA GPU

Software

  • PEFT 0.13.3.dev0
  • Transformers 4.25.1
  • PyTorch 1.12.1

Citation

BibTeX:

@misc{llama3.2-3b-Darija-Morocco-QA,
  author = {Achraf Abbaoui},
  title = {Llama 3.2 3B Fine-Tuned for Darija Moroccan QA},
  year = {2024},
  howpublished = {\url{https://huggingface.co/AchrafABBAOUI/llama3.2-3b-Darija-Morocco-QA}}
}

APA:

Abbaoui, A. (2024). Llama 3.2 3B Fine-Tuned for Darija Moroccan QA. Retrieved from https://huggingface.co/AchrafABBAOUI/llama3.2-3b-Darija-Morocco-QA

Glossary

  • Darija: A dialect of Arabic spoken in Morocco.
  • Fine-tuning: The process of training a pre-trained model on a specific dataset to improve its performance on a particular task.

More Information

For more information, please visit the repository.

Model Card Authors

Achraf Abbaoui

Model Card Contact

For any questions or issues, please contact Achraf Abbaoui at [[email protected]].
