Update README.md
README.md CHANGED
@@ -1,6 +1,5 @@
 ---
 base_model:
-- mshojaei77/gemma-2-2b-fa-v3
 - mshojaei77/Gemma-2-2b-fa
 library_name: transformers
 tags:
@@ -17,14 +16,13 @@ language:
 
 
 
-This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements through techniques like **
+This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements through techniques like **self-merging** to enhance its performance and efficiency for Persian language conversational tasks.
 
 This model builds upon the foundation of Google's Gemma-2-2b-it and the initial fine-tuning efforts of `mshojaei77/Gemma-2b-fa`.
 
 ## Key Improvements in v2
 
 * **Optimized Performance:** This version incorporates techniques to improve the model's performance in generating Persian text and engaging in conversations.
-* **Selective Self-Distillation:** Experimental self-distillation methods have been applied to refine the model's knowledge and generation quality.
 * **Self-Merged:** The model has been merged with itself, potentially leading to a more robust and coherent representation.
 
 **Important Note:** This is still an **experimental model** and is under active development. While optimizations have been applied, it's crucial to understand that it retains the limitations inherent in a 2 billion parameter model and the early-stage nature of this project. **Output quality may vary, and critical evaluation is still necessary.**
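The card does not spell out how the self-merge was performed. As a rough, non-authoritative sketch of the idea, the snippet below averages the weights of two checkpoints of the same architecture; the checkpoint ids (taken from the front matter) and the uniform 50/50 interpolation are assumptions for illustration, not the published recipe. Tools such as mergekit implement more controlled variants of the same idea (e.g. SLERP or layer-wise merges).

```python
# Illustrative self-merge by uniform weight averaging (assumed recipe, not the
# documented one). The two checkpoint ids below come from this card's front
# matter and may not be the actual merge ingredients.
import torch
from transformers import AutoModelForCausalLM

model_a = AutoModelForCausalLM.from_pretrained(
    "mshojaei77/Gemma-2-2b-fa", torch_dtype=torch.bfloat16
)
model_b = AutoModelForCausalLM.from_pretrained(
    "mshojaei77/gemma-2-2b-fa-v3", torch_dtype=torch.bfloat16
)

state_a = model_a.state_dict()
state_b = model_b.state_dict()

# Average every parameter tensor; architecture and parameter count are unchanged.
merged = {name: (state_a[name] + state_b[name]) / 2 for name in state_a}

model_a.load_state_dict(merged)
model_a.save_pretrained("gemma-2-2b-fa-self-merged")
```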
@@ -67,7 +65,7 @@ print(assistant_response)
 
 * **Base Model:** google/gemma-2-2b-it
 * **Previous Version:** mshojaei77/Gemma-2b-fa
-* **Optimization Techniques:**
+* **Optimization Techniques:** Self-Merging, Further Optimization.
 * **Architecture:** Gemma2ForCausalLM (same as base model)
 * **Model Size:** 2 billion parameters
 
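Because the merge keeps the base architecture, the two claims above can be checked directly with transformers. A minimal sketch, assuming the checkpoint id from the front matter (substitute this repository's own id):

```python
# Sanity check: confirm the architecture and report the parameter count.
from transformers import AutoConfig, AutoModelForCausalLM

repo_id = "mshojaei77/Gemma-2-2b-fa"  # assumed id; replace with this repository's id

config = AutoConfig.from_pretrained(repo_id)
print(config.architectures)  # expected to include "Gemma2ForCausalLM"

model = AutoModelForCausalLM.from_pretrained(repo_id)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```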
@@ -75,7 +73,7 @@ print(assistant_response)
 
 This model is intended for:
 
-* **Research and Experimentation:** Exploring the impact of
+* **Research and Experimentation:** Exploring the impact of Self-Merging techniques on Persian Gemma models.
 * **Educational Purposes:** Demonstrating advanced fine-tuning and optimization methods.
 * **Community Development:** Contributing to the growing ecosystem of Persian language models.
 * **Prototyping (with caution):** For early-stage prototyping, acknowledging its experimental nature.
@@ -90,3 +88,7 @@ While optimized, this model still has limitations:
 * **Potential for Imperfections:** May still exhibit issues like fluency problems, factual inaccuracies, or biases.
 
 **Use this model responsibly and be aware of its experimental nature.**
+
+**Citation:**
+
+If you use this model in your research or applications, please cite it using the following DOI: [10.57967/hf/4772](https://doi.org/10.57967/hf/4772)