Upload README.md with huggingface_hub
README.md CHANGED
@@ -6,7 +6,7 @@ language:
 base_model: google/gemma-2-9b
 library_name: transformers
 ---
-# Gemma2 9B for Telugu: 50 target vocabulary size + Align target vocabulary initialization +
+# Gemma2 9B for Telugu: 50 target vocabulary size + Align target vocabulary initialization + 2x2LS/MTP/512 training
 
 This model is built on top of Gemma2 9B adapted for Telugu using 30K target language sentences sampled from CC-100.
 
@@ -14,7 +14,7 @@ This model is built on top of Gemma2 9B adapted for Telugu using 30K target lang
 
 * **Vocabulary**: This model has an additional 50 target vocabulary.
 * **Target vocabulary initialization**: The target weights of the embedding were initialized using Align initialization.
-* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the
+* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the 2x2LS/MTP/512 strategies introduced in the paper.
 
 ## Model Description
 
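The updated card describes a vocabulary-expanded variant of google/gemma-2-9b with 50 additional Telugu tokens. As a rough illustration only, below is a minimal sketch of loading such a model with the standard transformers API and inspecting the expanded tokenizer and embedding sizes; the repository id is a hypothetical placeholder (the card excerpt does not name it), and `device_map="auto"` assumes accelerate is installed.

```python
# Minimal sketch (not part of the card): load a vocabulary-expanded Gemma2 model
# and check that the tokenizer/embedding matrix reflect the added target tokens.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/gemma-2-9b-telugu-50-align"  # hypothetical placeholder id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # requires accelerate; remove to load on CPU
)

# The card states 50 additional target-language (Telugu) tokens on top of the base vocabulary.
print("tokenizer vocab size:", len(tokenizer))
print("embedding rows:", model.get_input_embeddings().weight.shape[0])

# Quick generation check on a Telugu prompt.
inputs = tokenizer("తెలుగు భాష", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```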