---
license: apache-2.0
datasets:
- Arailym-aitu/small_kazakh_corpus
language:
- kk
metrics:
- accuracy
- f1
base_model:
- nur-dev/roberta-kaz-large
new_version: nur-dev/roberta-kaz-large
pipeline_tag: text-classification
---

# Model Card for Model ID

This is a RoBERTa-based model for text classification in the Kazakh language. It was fine-tuned from `nur-dev/roberta-kaz-large` on the Small Kazakh Corpus dataset (`Arailym-aitu/small_kazakh_corpus`).

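A minimal usage sketch, assuming the fine-tuned checkpoint is published on the Hugging Face Hub and loads with the `transformers` text-classification pipeline. The repository ID is not stated in this card, so `model_id` is a placeholder you would need to fill in. The helper names `classify` and `confident` and the default threshold are illustrative, and the label names returned depend on the checkpoint's configuration.

```python
from typing import Dict, List


def classify(texts: List[str], model_id: str) -> List[Dict]:
    """Classify Kazakh texts with a text-classification pipeline.

    `model_id` is a placeholder: this card does not state the Hub
    repository ID of the fine-tuned checkpoint.
    """
    # Imported lazily so `confident` below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # The pipeline returns one {"label": ..., "score": ...} dict per input text.
    return classifier(texts)


def confident(predictions: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Keep only predictions whose score is at least `threshold` (an assumed cutoff)."""
    return [p for p in predictions if p["score"] >= threshold]
```

Calling `classify(["Сәлем, әлем!"], model_id=...)` with the actual repository ID would download the checkpoint and return one prediction per input text. Filtering by score is optional and is shown only as one way to post-process the output.
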
## Model Details

### Model Description

The model aims to improve natural language processing (NLP) for the Kazakh language, particularly text classification.

- **Developed by:** Tleubayeva Arailym, Tabuldin Aisultan, Aubakirov Sultan
- **Model type:** Transformer-based (RoBERTa)
- **Language(s) (NLP):** Kazakh (kk)
- **License:** apache-2.0

### Results

Fine-tuning improved both accuracy and F1-score over the base model.

Base model performance:

- **Accuracy:** 50.30%
- **F1-score:** 48.89%

Fine-tuned model performance:

- **Accuracy:** 55.51% (+5.21 percentage points, about 10% relative)
- **F1-score:** 54.83% (+5.94 percentage points, about 12% relative)

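The improvement can be stated in two ways: as an absolute gain in percentage points, or as a relative gain over the base score. The sketch below works through that arithmetic for the four scores reported above. The function name `gains` and the dictionary `REPORTED` are illustrative and not part of any released code.

```python
from typing import Tuple


def gains(base: float, tuned: float) -> Tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = tuned - base
    relative = absolute / base * 100.0
    return absolute, relative


# Scores reported in this card, in percent: (base model, fine-tuned model).
REPORTED = {
    "accuracy": (50.30, 55.51),
    "f1": (48.89, 54.83),
}

for name, (base, tuned) in REPORTED.items():
    absolute, relative = gains(base, tuned)
    print(f"{name}: +{absolute:.2f} points, +{relative:.2f}% relative")
```

The two views differ noticeably when the base score is low, so it helps to say which one a figure such as "+10%" refers to.
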
## Citation

A citation will be added later.

## Model Card Authors

- Tleubayeva Arailym, PhD student at Astana IT University
- Tabuldin Aisultan, third-year student at Astana IT University
- Aubakirov Sultan, third-year student at Astana IT University