RooseBERT-scr-cased
This model is a fine-tuned version of bert-base-cased.
It achieves the following results on the evaluation set:
- Loss: 0.9785
- Accuracy: 0.7668
- Perplexity: 2.789
Model description
This model builds on the same architecture as bert-base-cased, leveraging transformer-based contextual embeddings to better understand the nuances of political language.
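As a minimal usage sketch, the model can be queried through the standard fill-mask pipeline; the repository id below is an assumption, so substitute the actual Hub path:

```python
from transformers import pipeline

# Hypothetical repository id -- replace with the actual Hub path of this model.
fill_mask = pipeline("fill-mask", model="RooseBERT-scr-cased")

# Predict the masked token in a politically flavoured sentence.
for pred in fill_mask("The minister defended the [MASK] reform in yesterday's debate."):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```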
Intended Use Cases
Suitable Applications
- Political discourse analysis: Identifying patterns, sentiments, and rhetoric in debates.
- Contextual word interpretation: Understanding the meaning of words within political contexts.
- Sentiment classification: Differentiating positive, neutral, and negative sentiments in political speech (see the classification sketch after this list).
- Text generation improvement: Enhancing auto-completions and summaries in politically focused language models.
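For downstream uses such as sentiment classification, a task-specific head would typically be placed on top of the encoder and fine-tuned on labelled political text. A minimal sketch, assuming a hypothetical three-class (negative/neutral/positive) setup and the same assumed repository id:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "RooseBERT-scr-cased"  # hypothetical hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The classification head is randomly initialised and must be fine-tuned
# on labelled data before it produces meaningful predictions.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=3)
```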
Limitations
- Bias Sensitivity: Since it was trained on political debates, inherent biases in the data may be reflected in the model’s outputs.
- Not Suitable for General-Purpose NLP: It is optimized specifically for political contexts.
- Does Not Perform Fact-Checking: The model does not verify factual accuracy.
Training and Evaluation Data
The model was trained on a curated dataset of political debates sourced from:
- Parliamentary transcripts
- Presidential debates and public speeches
Training procedure
Training hyperparameters
The following hyperparameters were used during training (an approximate TrainingArguments sketch follows this list):
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 4
- total_train_batch_size: 2048
- total_eval_batch_size: 512
- optimizer: adamw_torch with betas=(0.9, 0.98) and epsilon=1e-06 (no additional optimizer arguments)
- lr_scheduler_type: linear
- training_steps: 250000
- mixed_precision_training: Native AMP
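A minimal sketch of how these settings map onto Hugging Face TrainingArguments; the output directory is a placeholder and any option not listed above is an assumption rather than the authors' exact configuration:

```python
from transformers import TrainingArguments

# With 8 GPUs and gradient accumulation of 4, the effective train batch size
# is 64 * 8 * 4 = 2048 and the effective eval batch size is 64 * 8 = 512.
training_args = TrainingArguments(
    output_dir="roosebert-scr-cased",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,
    max_steps=250_000,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    seed=42,
    fp16=True,  # native AMP mixed precision
)
```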
Training results
| Training Loss | Epoch    | Step   | Accuracy | Validation Loss |
|:-------------:|:--------:|:------:|:--------:|:---------------:|
| No log        | 0        | 0      | 0.0000   | 10.3984         |
| 1.4166        | 12.6967  | 50000  | 0.6993   | 1.3574          |
| 1.2695        | 25.3936  | 100000 | 0.7231   | 1.2178          |
| 1.2069        | 38.0904  | 150000 | 0.7336   | 1.1592          |
| 1.1727        | 50.7871  | 200000 | 0.7386   | 1.1309          |
| 1.012         | 274.6656 | 250000 | 0.7670   | 0.9785          |
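Perplexity for a masked language model is conventionally the exponential of the cross-entropy loss on masked positions. As an illustrative sketch (not the authors' evaluation script; the model id and example sentence are assumptions), a single masked position can be scored like this:

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "RooseBERT-scr-cased"  # hypothetical hub id; replace with the actual path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)
model.eval()

text = "The opposition criticised the proposed budget during the debate."
inputs = tokenizer(text, return_tensors="pt")

# Mask one content token and score only that position, as in the MLM objective.
labels = torch.full_like(inputs["input_ids"], -100)  # -100 = ignored by the loss
position = 5                                         # an arbitrary non-special token
labels[0, position] = inputs["input_ids"][0, position]
inputs["input_ids"][0, position] = tokenizer.mask_token_id

with torch.no_grad():
    loss = model(**inputs, labels=labels).loss.item()

# Perplexity is the exponential of the cross-entropy loss.
print(f"masked-LM loss: {loss:.4f}  perplexity: {math.exp(loss):.3f}")
```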
Framework versions
- Transformers 4.49.0.dev0
- Pytorch 2.5.1
- Datasets 3.2.0
- Tokenizers 0.21.0
Citation
If you use this model, please cite:
@misc{dore2025roosebertnewdealpolitical,
      title={RooseBERT: A New Deal For Political Language Modelling},
      author={Deborah Dore and Elena Cabrio and Serena Villata},
      year={2025},
      eprint={2508.03250},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.03250},
}