	---
	license: apache-2.0
	library_name: transformers
	pipeline_tag: text-classification
	language:
	- it
	tags:
	- transformers
	- xlm-roberta
	- multilingual
	- social-media
	- text-classification
	---
	# it-no-bio-20251014-t14

	Binary classifier for slur reclamation.

	Task: distinguish LGBTQ+ reclamation from non-reclamation uses of harmful words in social media text.

	> Trial timestamp (UTC): 2025-10-14 10:43:41
	>
	> Data case: `it`

	## Configuration (trial hyperparameters)

	Model: Alibaba-NLP/gte-multilingual-base

	| Hyperparameter | Value |
	|---|---|
	| LANGUAGES | it |
	| LR | 3e-05 |
	| EPOCHS | 3 |
	| MAX_LENGTH | 256 |
	| USE_BIO | False |
	| USE_LANG_TOKEN | False |
	| GATED_BIO | False |
	| FOCAL_LOSS | True |
	| FOCAL_GAMMA | 1.5 |
	| USE_SAMPLER | True |
	| R_DROP | True |
	| R_KL_ALPHA | 1.0 |
	| TEXT_NORMALIZE | True |
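
As a reading aid for `FOCAL_LOSS` and `FOCAL_GAMMA`: a standard focal loss scales each example's cross-entropy by `(1 - p_true) ** gamma`, where `p_true` is the probability assigned to the correct class, so confidently classified examples contribute less. The sketch below assumes that standard, unweighted formulation. This card does not state the exact variant used in training, and `focal_loss` is an illustrative name, not a function from the training code.

```python
import math

def focal_loss(p_true: float, gamma: float = 1.5) -> float:
    """Focal loss for one example, given the probability of its true class (0 < p_true <= 1)."""
    # Cross-entropy term -log(p_true), down-weighted by (1 - p_true) ** gamma
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

easy = focal_loss(0.95)             # confident, correct prediction
hard = focal_loss(0.30)             # poorly classified example
plain_ce = focal_loss(0.95, 0.0)    # gamma = 0 reduces to plain cross-entropy
print(easy, hard, plain_ce)
```

With `gamma = 0` the expression reduces to ordinary cross-entropy, which makes it easy to see how much the easy example is discounted.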

	## Dev set results (summary)

	| Metric | Value |
	|---|---|
	| f1_macro_dev_0.5 | 0.8676160051978992 |
	| f1_weighted_dev_0.5 | 0.9129082823912861 |
	| accuracy_dev_0.5 | 0.9079754601226994 |
	| f1_macro_dev_best_global | 0.905011655011655 |
	| f1_weighted_dev_best_global | 0.9400374676448295 |
	| accuracy_dev_best_global | 0.9386503067484663 |
	| f1_macro_dev_best_by_lang | 0.905011655011655 |
	| f1_weighted_dev_best_by_lang | 0.9400374676448295 |
	| accuracy_dev_best_by_lang | 0.9386503067484663 |
	| default_threshold | 0.5 |
	| best_threshold_global | 0.7000000000000001 |
	| thresholds_by_lang | `{"it": 0.7000000000000001}` |

	### Thresholds
	- Default: `0.5`
	- Best global: `0.7000000000000001`
	- Best by language: `{"it": 0.7000000000000001}`
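
The card does not describe how the best thresholds were selected. One common approach, sketched here as an assumption, is to sweep a grid of candidate thresholds on the Dev set and keep the one with the highest macro-F1. The grid, the tie-breaking rule, and the names `f1_macro` and `best_threshold` are all illustrative; the toy labels and probabilities are not taken from the private data.

```python
def f1_macro(y_true, y_pred):
    """Macro-averaged F1 over the two classes 0 and 1."""
    scores = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def best_threshold(y_true, probs, grid=None):
    """Return (threshold, macro-F1) for the grid value that maximizes macro-F1."""
    if grid is None:
        grid = [i / 20 for i in range(1, 20)]  # 0.05 .. 0.95, an assumed grid
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        preds = [int(p >= t) for p in probs]
        score = f1_macro(y_true, preds)
        if score > best_f1:  # strict comparison: ties keep the lowest threshold
            best_t, best_f1 = t, score
    return best_t, best_f1

# Toy data for illustration only
y_true = [0, 0, 0, 1, 1, 0]
probs = [0.10, 0.40, 0.62, 0.75, 0.90, 0.55]
print(best_threshold(y_true, probs))
```

A stored value such as `0.7000000000000001` is what floating-point arithmetic typically produces for a grid built in steps of 0.1 (for example `7 * 0.1`), so it can be read as 0.70.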

	## Detailed evaluation

	### Classification report @ 0.5
	```text
	              precision    recall  f1-score   support

	 no-recl (0)     0.9835    0.9015    0.9407       132
	    recl (1)     0.6905    0.9355    0.7945        31

	    accuracy                         0.9080       163
	   macro avg     0.8370    0.9185    0.8676       163
	weighted avg     0.9277    0.9080    0.9129       163
	```

	### Classification report @ best global threshold (t=0.70)
	```text
	              precision    recall  f1-score   support

	 no-recl (0)     0.9766    0.9470    0.9615       132
	    recl (1)     0.8000    0.9032    0.8485        31

	    accuracy                         0.9387       163
	   macro avg     0.8883    0.9251    0.9050       163
	weighted avg     0.9430    0.9387    0.9400       163
	```
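
The two reports above can be cross-checked from confusion-matrix counts. The counts used here are inferred from recall times support (for example 0.9015 × 132 ≈ 119 correct `no-recl` predictions at 0.5); they are not stored in the card itself, and `report` is an illustrative helper.

```python
def report(tn, fp, fn, tp):
    """Per-class (precision, recall, F1), accuracy and macro-F1 from a 2x2 confusion matrix."""
    def prf(hits, false_pos, false_neg):
        precision = hits / (hits + false_pos)
        recall = hits / (hits + false_neg)
        return precision, recall, 2 * precision * recall / (precision + recall)

    neg = prf(tn, fn, fp)  # no-recl (0): class-1 misses count as its false positives
    pos = prf(tp, fp, fn)  # recl (1)
    accuracy = (tn + tp) / (tn + fp + fn + tp)
    return {"no-recl": neg, "recl": pos, "accuracy": accuracy,
            "f1_macro": (neg[2] + pos[2]) / 2}

# Counts inferred from the reports (assumption: recall * support, rounded)
at_050 = report(tn=119, fp=13, fn=2, tp=29)
at_070 = report(tn=125, fp=7, fn=3, tp=28)
print(at_050["f1_macro"], at_070["f1_macro"])
```

Raising the threshold from 0.5 to 0.70 removes 6 false positives (13 to 7) at the cost of 1 extra false negative (2 to 3), which accounts for the gain in `recl` precision and macro-F1.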

	### Classification report @ best per-language thresholds
	```text
	              precision    recall  f1-score   support

	 no-recl (0)     0.9766    0.9470    0.9615       132
	    recl (1)     0.8000    0.9032    0.8485        31

	    accuracy                         0.9387       163
	   macro avg     0.8883    0.9251    0.9050       163
	weighted avg     0.9430    0.9387    0.9400       163
	```


	## Per-language metrics (at best-by-lang)

	| lang | n | acc | f1_macro | f1_weighted | prec_macro | rec_macro | prec_weighted | rec_weighted |
	|---|---:|---:|---:|---:|---:|---:|---:|---:|
	| it | 163 | 0.9387 | 0.9050 | 0.9400 | 0.8883 | 0.9251 | 0.9430 | 0.9387 |


	## Data
	- Train/Dev: private multilingual splits, with about 15% held out as Dev, stratified by (lang, label) pair.
	- Source: merged EN/IT/ES data with user bios retained; the bio field is ignored when the model does not use it (here `USE_BIO` is False).
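
A split stratified by (lang, label) can be sketched as follows. This is a minimal illustration under assumptions, not the script used to build the private splits: the row layout (`lang`, `label`, `text` keys), the seed, and the name `stratified_split` are hypothetical, and the rows are synthetic.

```python
import random
from collections import defaultdict

def stratified_split(rows, dev_frac=0.15, seed=0):
    """Split rows into (train, dev), taking dev_frac from every (lang, label) group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["lang"], row["label"])].append(row)
    rng = random.Random(seed)
    train, dev = [], []
    for key in sorted(groups):          # fixed order keeps the split reproducible
        members = groups[key][:]
        rng.shuffle(members)
        n_dev = round(len(members) * dev_frac)
        dev.extend(members[:n_dev])
        train.extend(members[n_dev:])
    return train, dev

# Synthetic rows, for illustration only
rows = [{"lang": lang, "label": label, "text": f"{lang}-{label}-{i}"}
        for lang in ("en", "it", "es") for label in (0, 1) for i in range(20)]
train, dev = stratified_split(rows)
print(len(train), len(dev))
```

Sampling inside each group keeps the class balance of every language roughly the same in Train and Dev, which matters when the positive class is small.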

	## Usage
	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
	import torch
	import numpy as np

	repo = "SimoneAstarita/it-no-bio-20251014-t14"
	tok = AutoTokenizer.from_pretrained(repo)
	cfg = AutoConfig.from_pretrained(repo)
	model = AutoModelForSequenceClassification.from_pretrained(repo)

	texts = ["example text ..."]
	langs = ["it"]  # one language code per text; only "it" has its own threshold here

	mode = "best_global"  # or "0.5", "by_lang"

	enc = tok(texts, truncation=True, padding=True, max_length=256, return_tensors="pt")
	with torch.no_grad():
	    logits = model(**enc).logits
	# Probability of the positive class, recl (1)
	probs = torch.softmax(logits, dim=-1)[:, 1].cpu().numpy()

	if mode == "0.5":
	    th = 0.5
	    preds = (probs >= th).astype(int)
	elif mode == "best_global":
	    th = getattr(cfg, "best_threshold_global", 0.5)
	    preds = (probs >= th).astype(int)
	elif mode == "by_lang":
	    th_by_lang = getattr(cfg, "thresholds_by_lang", {})
	    langs_arr = np.array(langs)
	    preds = np.zeros_like(probs, dtype=int)
	    for lg in np.unique(langs_arr):
	        # Languages without their own threshold fall back to the global one
	        t = th_by_lang.get(lg, getattr(cfg, "best_threshold_global", 0.5))
	        mask = langs_arr == lg
	        preds[mask] = (probs[mask] >= t).astype(int)
	else:
	    raise ValueError(f"unknown mode: {mode}")
	print(list(zip(texts, preds, probs)))
	```

	### Additional files
	- `reports.json`: all metrics (macro/weighted/accuracy) at 0.5, at the best global threshold, and at the best per-language thresholds.
	- `config.json`: stores the thresholds `default_threshold`, `best_threshold_global`, and `thresholds_by_lang`.
	- `postprocessing.json`: duplicate of the threshold info, for external tools.
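
For tools that read the thresholds without loading the model, the lookup order can be sketched like this. The sketch assumes `postprocessing.json` uses the same three key names listed for `config.json`; the file's actual layout is not shown in this card, so the JSON string below is a stand-in built from the values reported above, and `resolve_threshold` is a hypothetical helper.

```python
import json

def resolve_threshold(post: dict, lang=None) -> float:
    """Per-language threshold if present, else the global one, else the default."""
    by_lang = post.get("thresholds_by_lang") or {}
    if lang in by_lang:
        return float(by_lang[lang])
    fallback = post.get("default_threshold", 0.5)
    return float(post.get("best_threshold_global", fallback))

# Stand-in for the file contents (assumed layout, values from this card)
post = json.loads(
    '{"default_threshold": 0.5, "best_threshold_global": 0.7000000000000001, '
    '"thresholds_by_lang": {"it": 0.7000000000000001}}'
)
print(resolve_threshold(post, "it"), resolve_threshold(post, "en"))
```

The fallback chain mirrors the `by_lang` branch of the usage example: a language without its own entry uses the global threshold.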