mshojaei77 committed
Commit 4519270 · verified · 1 Parent(s): 03a0603

Update README.md

Files changed (1):
  1. README.md +7 -5
README.md CHANGED
@@ -1,6 +1,5 @@
 ---
 base_model:
-- mshojaei77/gemma-2-2b-fa-v3
 - mshojaei77/Gemma-2-2b-fa
 library_name: transformers
 tags:
@@ -17,14 +16,13 @@ language:

 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6556b1bb85d43542fa1a8f91/NOSivA-mvcG_DhVkJQY3p.png)

-This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements through techniques like **selective self-distillation** and **self-merging** to enhance its performance and efficiency for Persian language conversational tasks.
+This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements through techniques like **self-merging** to enhance its performance and efficiency for Persian language conversational tasks.

 This model builds upon the foundation of Google's Gemma-2-2b-it and the initial fine-tuning efforts of `mshojaei77/Gemma-2b-fa`.

 ## Key Improvements in v2

 * **Optimized Performance:** This version incorporates techniques to improve the model's performance in generating Persian text and engaging in conversations.
-* **Selective Self-Distillation:** Experimental self-distillation methods have been applied to refine the model's knowledge and generation quality.
 * **Self-Merged:** The model has been merged with itself, potentially leading to a more robust and coherent representation.

 **Important Note:** This is still an **experimental model** and is under active development. While optimizations have been applied, it's crucial to understand that it retains the limitations inherent in a 2 billion parameter model and the early-stage nature of this project. **Output quality may vary, and critical evaluation is still necessary.**
@@ -67,7 +65,7 @@ print(assistant_response)

 * **Base Model:** google/gemma-2-2b-it
 * **Previous Version:** mshojaei77/Gemma-2b-fa
-* **Optimization Techniques:** Selective Self-Distillation, Self-Merging, Further Optimization.
+* **Optimization Techniques:** Self-Merging, Further Optimization.
 * **Architecture:** Gemma2ForCausalLM (same as base model)
 * **Model Size:** 2 billion parameters

@@ -75,7 +73,7 @@ print(assistant_response)

 This model is intended for:

-* **Research and Experimentation:** Exploring the impact of Self-Distillation and Self-Merging techniques on Persian Gemma models.
+* **Research and Experimentation:** Exploring the impact of Self-Merging techniques on Persian Gemma models.
 * **Educational Purposes:** Demonstrating advanced fine-tuning and optimization methods.
 * **Community Development:** Contributing to the growing ecosystem of Persian language models.
 * **Prototyping (with caution):** For early-stage prototyping, acknowledging its experimental nature.
@@ -90,3 +88,7 @@ While optimized, this model still has limitations:
 * **Potential for Imperfections:** May still exhibit issues like fluency problems, factual inaccuracies, or biases.

 **Use this model responsibly and be aware of its experimental nature.**
+
+**Citation:**
+
+If you use this model in your research or applications, please cite it using the following DOI: [10.57967/hf/4772](https://doi.org/10.57967/hf/4772)
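The commit drops selective self-distillation from the card, leaving self-merging as the stated optimization. The card does not describe how the merge was performed, so the sketch below only illustrates one common merging recipe: a linear weight average between two checkpoints of the same architecture. The checkpoint names, output path, and interpolation weight `ALPHA` are assumptions for illustration, not the author's actual procedure.

```python
# Minimal sketch of a linear weight merge; NOT the author's documented method.
# It averages the parameters of two checkpoints sharing the Gemma2 architecture;
# in a "self-merge" the second checkpoint would be another snapshot of the same model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CKPT_A = "mshojaei77/Gemma-2-2b-fa"   # earlier fine-tune listed in the card
CKPT_B = "mshojaei77/Gemma-2-2b-fa"   # placeholder: a second snapshot would go here
ALPHA = 0.5                           # interpolation weight (assumption)

model_a = AutoModelForCausalLM.from_pretrained(CKPT_A, torch_dtype=torch.bfloat16)
model_b = AutoModelForCausalLM.from_pretrained(CKPT_B, torch_dtype=torch.bfloat16)

# Interpolate parameter-by-parameter between the two state dicts.
state_b = model_b.state_dict()
merged = {name: ALPHA * p + (1.0 - ALPHA) * state_b[name]
          for name, p in model_a.state_dict().items()}

model_a.load_state_dict(merged)
model_a.save_pretrained("gemma-2-2b-fa-self-merged")
AutoTokenizer.from_pretrained(CKPT_A).save_pretrained("gemma-2-2b-fa-self-merged")
```

Community tools such as mergekit implement more elaborate variants (e.g. SLERP or passthrough layer stacking); the card does not state which method, if any, was used here.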
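The hunk headers above reference a usage snippet in the unchanged part of the README that ends with `print(assistant_response)`. That snippet is not visible in this diff; the sketch below shows the standard `transformers` chat workflow such a snippet typically follows. The repository id and generation settings are placeholders, not copied from the README.

```python
# Illustrative chat usage via the standard transformers API; the model id and
# generation parameters are placeholders, since the README's own snippet is not
# shown in this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mshojaei77/Gemma-2-2b-fa"  # placeholder; substitute this repository's actual id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "سلام! خودت را معرفی کن."}]  # "Hi! Introduce yourself."
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
assistant_response = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
)
print(assistant_response)
```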