Update README.md
README.md
CHANGED
@@ -42,7 +42,7 @@ model-index:
 
 # MarianMT Eastern Syriac Vocalization Model
 
-A fine-tuned MarianMT model for automatic Eastern Syriac (Mossul
+A fine-tuned MarianMT model for automatic Eastern Syriac (Mossul Bible) vocalization, converting consonantal (unvocalized) Syriac text to fully vocalized text with diacritical marks.
 
 ## Model Description
 
@@ -51,9 +51,9 @@ This model is fine-tuned from [`Helsinki-NLP/opus-mt-tc-bible-big-sem-en`](https
 ### Key Features
 
 - **Single-direction model**: Converts consonantal Syriac (`>>syr_cons<<`) to vocalized Eastern Syriac (`>>syr_voc<<`)
-- **Eastern Syriac optimized**: Trained specifically on Eastern Syriac texts (Mossul
+- **Eastern Syriac optimized**: Trained specifically on Eastern Syriac texts (Mossul edition) and Digital Syriac Corpus texts vocalized in Eastern Syriac
 - **High performance**: Achieves 62.41 BLEU, 87.98 chrF, and 58.81% character accuracy on test set
-- **Biblical and corpus text optimized**: Trained on Eastern Syriac Bible texts (Mossul) and Digital Syriac Corpus texts
+- **Biblical and corpus text optimized**: Trained on Eastern Syriac Bible texts (Mossul edition) and Digital Syriac Corpus texts
 
 ## Model Details
 
@@ -181,7 +181,7 @@ Recommended generation parameters:
 
 ## Limitations and Bias
 
-- **Dialect Specificity**: This model is trained specifically on Eastern Syriac (Mossul
+- **Dialect Specificity**: This model is trained specifically on Eastern Syriac (Mossul edition). Performance may vary on Western Syriac or other Syriac dialects.
 - **Domain Specificity**: This model is trained primarily on biblical and corpus Syriac texts. Performance may vary on other domains (e.g., modern Syriac, poetry, prose).
 - **Single Direction**: The model only vocalizes consonantal text. It does not perform the reverse operation (removing vocalization).
 - **Length Constraints**: Maximum input/output length is 512 tokens. Longer texts should be split into smaller segments.
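The **Length Constraints** note says longer texts should be split into smaller segments but does not say how. One possible approach, sketched below, is to pack whitespace-separated words greedily into segments that stay within a token budget. The function name `split_into_segments` and the `count_tokens` callback are hypothetical and not part of the README; the callback stands in for whatever tokenizer the model is loaded with.

```python
from typing import Callable, List


def split_into_segments(
    text: str,
    count_tokens: Callable[[str], int],  # hypothetical callback: returns the token count of a string
    max_tokens: int = 512,               # limit stated in the README
) -> List[str]:
    """Greedily pack whitespace-separated words into segments of at most max_tokens."""
    segments: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and count_tokens(candidate) > max_tokens:
            # Adding this word would exceed the budget: close the current segment.
            segments.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        segments.append(" ".join(current))
    # Note: a single word longer than max_tokens still becomes its own (oversized) segment.
    return segments
```

With a Hugging Face tokenizer, `count_tokens` could be something like `lambda s: len(tokenizer(s)["input_ids"])`. It may also be worth choosing a budget somewhat below 512 to leave room for special tokens, and splitting at verse or sentence boundaries first where the text has them, since vocalization can depend on context within a sentence.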