Update README.md
README.md CHANGED
@@ -1,6 +1,5 @@
 ---
 base_model:
-- mshojaei77/gemma-2-2b-fa-v3
 - mshojaei77/Gemma-2-2b-fa
 library_name: transformers
 tags:
@@ -17,14 +16,13 @@ language:
 
 
 
-This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements through techniques like **
+This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements through techniques like **self-merging** to enhance its performance and efficiency for Persian language conversational tasks.
 
 This model builds upon the foundation of Google's Gemma-2-2b-it and the initial fine-tuning efforts of `mshojaei77/Gemma-2b-fa`.
 
 ## Key Improvements in v2
 
 * **Optimized Performance:** This version incorporates techniques to improve the model's performance in generating Persian text and engaging in conversations.
-* **Selective Self-Distillation:** Experimental self-distillation methods have been applied to refine the model's knowledge and generation quality.
 * **Self-Merged:** The model has been merged with itself, potentially leading to a more robust and coherent representation.
 
 **Important Note:** This is still an **experimental model** and is under active development. While optimizations have been applied, it's crucial to understand that it retains the limitations inherent in a 2 billion parameter model and the early-stage nature of this project. **Output quality may vary, and critical evaluation is still necessary.**
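The card does not spell out how the self-merge was performed. As a rough, non-authoritative sketch of the idea, the snippet below averages the weights of two checkpoints of the same architecture; the checkpoint ids (taken from the front matter) and the uniform 50/50 interpolation are assumptions for illustration, not the published recipe. Tools such as mergekit implement more controlled variants of the same idea (e.g. SLERP or layer-wise merges).

```python
# Illustrative self-merge by uniform weight averaging (assumed recipe, not the
# documented one). The two checkpoint ids below come from this card's front
# matter and may not be the actual merge ingredients.
import torch
from transformers import AutoModelForCausalLM

model_a = AutoModelForCausalLM.from_pretrained(
    "mshojaei77/Gemma-2-2b-fa", torch_dtype=torch.bfloat16
)
model_b = AutoModelForCausalLM.from_pretrained(
    "mshojaei77/gemma-2-2b-fa-v3", torch_dtype=torch.bfloat16
)

state_a = model_a.state_dict()
state_b = model_b.state_dict()

# Average every parameter tensor; architecture and parameter count are unchanged.
merged = {name: (state_a[name] + state_b[name]) / 2 for name in state_a}

model_a.load_state_dict(merged)
model_a.save_pretrained("gemma-2-2b-fa-self-merged")
```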
@@ -67,7 +65,7 @@ print(assistant_response)
 
 * **Base Model:** google/gemma-2-2b-it
 * **Previous Version:** mshojaei77/Gemma-2b-fa
-* **Optimization Techniques:**
+* **Optimization Techniques:** Self-Merging, Further Optimization.
 * **Architecture:** Gemma2ForCausalLM (same as base model)
 * **Model Size:** 2 billion parameters
 
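Because the merge keeps the base architecture, the two claims above can be checked directly with transformers. A minimal sketch, assuming the checkpoint id from the front matter (substitute this repository's own id):

```python
# Sanity check: confirm the architecture and report the parameter count.
from transformers import AutoConfig, AutoModelForCausalLM

repo_id = "mshojaei77/Gemma-2-2b-fa"  # assumed id; replace with this repository's id

config = AutoConfig.from_pretrained(repo_id)
print(config.architectures)  # expected to include "Gemma2ForCausalLM"

model = AutoModelForCausalLM.from_pretrained(repo_id)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```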
@@ -75,7 +73,7 @@ print(assistant_response)
 
 This model is intended for:
 
-* **Research and Experimentation:** Exploring the impact of
+* **Research and Experimentation:** Exploring the impact of Self-Merging techniques on Persian Gemma models.
 * **Educational Purposes:** Demonstrating advanced fine-tuning and optimization methods.
 * **Community Development:** Contributing to the growing ecosystem of Persian language models.
 * **Prototyping (with caution):** For early-stage prototyping, acknowledging its experimental nature.
@@ -90,3 +88,7 @@ While optimized, this model still has limitations:
 * **Potential for Imperfections:** May still exhibit issues like fluency problems, factual inaccuracies, or biases.
 
 **Use this model responsibly and be aware of its experimental nature.**
+
+**Citation:**
+
+If you use this model in your research or applications, please cite it using the following DOI: [10.57967/hf/4772](https://doi.org/10.57967/hf/4772)