Arailym-tleubayeva/roberta-kaz-large-small-kazakh-corpus


	---
	license: apache-2.0
	datasets:
	- Arailym-aitu/small_kazakh_corpus
	language:
	- kk
	metrics:
	- accuracy
	- f1
	base_model:
	- nur-dev/roberta-kaz-large
	new_version: nur-dev/roberta-kaz-large
	pipeline_tag: text-classification
	---
	# Model Card for roberta-kaz-large-small-kazakh-corpus

	This model performs text classification in the Kazakh language. It is based on the RoBERTa architecture and was fine-tuned from `nur-dev/roberta-kaz-large` on the Small Kazakh Corpus dataset (`Arailym-aitu/small_kazakh_corpus`).

	## Model Details

	### Model Description
	The model aims to enhance natural language processing (NLP) capabilities for the Kazakh language, particularly in text classification tasks.

	- Developed by: Tleubayeva Arailym, Tabuldin Aisultan, Aubakirov Sultan
	- Model type: Transformer-based (RoBERTa)
	- Language(s) (NLP): Kazakh (kk)
	- License: apache-2.0
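
As a rough illustration of how a model like this is typically used, the sketch below loads it through the `transformers` text-classification pipeline. The repository id in `MODEL_ID` is assumed from this page's name, the helper names `load_classifier` and `classify` are hypothetical, and the label names the model returns are not documented in this card, so treat the output format as something to verify rather than rely on.

```python
# Assumed Hub repository id, taken from this page's name (not confirmed by the authors).
MODEL_ID = "Arailym-tleubayeva/roberta-kaz-large-small-kazakh-corpus"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(texts, classifier=None):
    """Return one (label, score) pair per input text.

    `classifier` is any callable that maps a list of strings to a list of
    dicts with "label" and "score" keys, which is what the pipeline returns.
    """
    if classifier is None:
        classifier = load_classifier()
    return [(r["label"], r["score"]) for r in classifier(list(texts))]
```

Calling `classify(["Бүгін ауа райы өте жақсы."])` would download the model and return the predicted label with its score. Passing a prebuilt `classifier` avoids reloading the weights on every call.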

	### Results

	Fine-tuning improves both accuracy and F1-score over the base model:

	| Model | Accuracy | F1-score |
	|---|---|---|
	| Base model | 50.30% | 48.89% |
	| Fine-tuned model | 55.51% | 54.83% |
	| Improvement | +5.21 pp (+10.4% relative) | +5.94 pp (+12.1% relative) |
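
The gains can be recomputed from the reported scores. The short sketch below distinguishes the absolute gain (in percentage points) from the relative gain (as a percentage of the base score); the function name `improvement` is only illustrative.

```python
def improvement(base: float, tuned: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = tuned - base
    relative = absolute / base * 100
    return absolute, relative


# Scores reported in this model card, in percent: (base model, fine-tuned model).
scores = {
    "accuracy": (50.30, 55.51),
    "f1": (48.89, 54.83),
}

for name, (base, tuned) in scores.items():
    absolute, relative = improvement(base, tuned)
    print(f"{name}: +{absolute:.2f} pp ({relative:+.1f}% relative)")
# accuracy: +5.21 pp (+10.4% relative)
# f1: +5.94 pp (+12.1% relative)
```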

	## Citation

	A citation will be added later.

	## Model Card Authors
	Tleubayeva Arailym, PhD student at Astana IT University

	Tabuldin Aisultan, 3rd-year student at Astana IT University

	Aubakirov Sultan, 3rd-year student at Astana IT University