
---
language:
- pl

pipeline_tag: text-classification

widget:
- text: "Przykro patrzeć, a słuchać się nie da."
  example_title: "example 1"
- text: "Oczywiście ze Pan Prezydent to nasza duma narodowa!!"
  example_title: "example 2"

tags:
- text
- sentiment
- politics

metrics:
- accuracy
- f1

model-index:
- name: PaReS-sentimenTw-political-PL
  results:
  - task:
      type: sentiment-classification
      name: Text Classification
    dataset:
      type: tweets
      name: tweets_2020_electionsPL
    metrics:
    - type: f1
      value: 94.4

---

# PaReS-sentimenTw-political-PL

This model is a fine-tuned version of [dkleczek/bert-base-polish-cased-v1](https://huggingface.co/dkleczek/bert-base-polish-cased-v1) that classifies Polish text into three sentiment categories.
It was fine-tuned on a sample of 1,000 manually annotated tweets.

The model was developed as part of the ComPathos project: https://www.ncn.gov.pl/sites/default/files/listy-rankingowe/2020-09-30apsv2/streszczenia/497124-en.pdf

```
from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model_path = "eevvgg/PaReS-sentimenTw-political-PL"
sentiment_task = pipeline(task="sentiment-analysis", model=model_path, tokenizer=model_path)

# Two Polish example inputs: a critical remark about a debate and a short compliment.
sequence = [
    "Cała ta śmieszna debata była próbą ukrycia problemów gospodarczych jakie są i nadejdą, pytania w większości o mało istotnych sprawach",
    "Brawo panie ministrze!",
]

# The pipeline returns one dict per input, with 'label' and 'score' keys.
result = sentiment_task(sequence)
labels = [i["label"] for i in result]  # ['Negative', 'Positive']
```
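
Each pipeline result also carries a confidence score next to the label. The following is a minimal sketch of pairing every input with its predicted label and score. The helper names `pair_predictions` and `classify` are hypothetical and are not part of the model or of `transformers`; the import is done lazily so the pairing helper can be used on its own.

```python
from typing import Dict, List, Tuple, Union

Prediction = Dict[str, Union[str, float]]


def pair_predictions(texts: List[str], results: List[Prediction]) -> List[Tuple[str, str, float]]:
    """Pair each input text with its predicted label and score (rounded to 3 places)."""
    if len(texts) != len(results):
        raise ValueError("texts and results must have the same length")
    return [(t, str(r["label"]), round(float(r["score"]), 3)) for t, r in zip(texts, results)]


def classify(texts: List[str], model_path: str = "eevvgg/PaReS-sentimenTw-political-PL"):
    """Run the sentiment pipeline and return (text, label, score) triples."""
    # Imported lazily so pair_predictions works without transformers installed.
    from transformers import pipeline

    sentiment_task = pipeline(task="sentiment-analysis", model=model_path, tokenizer=model_path)
    return pair_predictions(texts, sentiment_task(texts))
```

Calling `classify(["Brawo panie ministrze!"])` downloads the model on first use and returns a list with one `(text, label, score)` triple.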


## Model Sources
- BibTeX citation:
```
@misc{SentimenTwPLGK2023,
  author = {Gajewska, Ewelina and Konat, Barbara},
  title = {PaReSTw: BERT for Sentiment Detection in Polish Language},
  year = {2023},
  howpublished = {\url{https://huggingface.co/eevvgg/PaReS-sentimenTw-political-PL}},
}
```




## Intended uses & limitations

The model is intended for sentiment detection in Polish text. It was fine-tuned on tweets from the political domain.


## Training and evaluation data

- Trained for 3 epochs with a mini-batch size of 8.
- Training loss: 0.1358926964368792
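
The card does not include the training script. As a rough sketch only, the two documented hyperparameters could be passed to the `transformers` `Trainer` API as shown here. Only the epoch count and batch size come from the card; treating the mini-batch size as a per-device batch size, the helper name `build_training_args`, and the output directory are assumptions, and all other settings are left at library defaults.

```python
# Hyperparameters stated in the model card (batch size assumed to be per device).
HYPERPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,
}


def build_training_args(output_dir: str = "bert-polish-sentiment"):
    """Return TrainingArguments carrying the two documented hyperparameters."""
    # Lazy import: needs the transformers library with a PyTorch backend.
    from transformers import TrainingArguments

    # output_dir is a hypothetical local path, not taken from the card.
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```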



The model achieves the following results on the test set (10% of the data):

- Number of examples: 100
- Mini-batch size: 8
- Accuracy: 0.950
- Macro F1: 0.944

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 0 | 0.960 | 0.980 | 0.970 | 49 |
| 1 | 0.958 | 0.885 | 0.920 | 26 |
| 2 | 0.923 | 0.960 | 0.941 | 25 |
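
The summary figures can be cross-checked against the per-class rows of the report above. This sketch recomputes macro F1 as the unweighted mean of the per-class F1 scores, and accuracy as the number of correctly classified examples (recall times support, rounded to whole examples) divided by the test set size. The function names are illustrative, and the inputs are the rounded values from the report, so the check is approximate by construction.

```python
# Per-class figures copied from the classification report: (recall, f1, support).
REPORT = {
    0: (0.980, 0.970, 49),
    1: (0.885, 0.920, 26),
    2: (0.960, 0.941, 25),
}


def macro_f1(rows):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, f1, _ in rows.values()) / len(rows)


def accuracy(rows):
    """Correct predictions over all examples, recovered from recall and support."""
    correct = sum(round(recall * support) for recall, _, support in rows.values())
    total = sum(support for _, _, support in rows.values())
    return correct / total


print(f"macro F1 = {macro_f1(REPORT):.3f}")
print(f"accuracy = {accuracy(REPORT):.3f}")
```

Both values agree with the reported macro F1 of 0.944 and accuracy of 0.950.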