---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:1024986
- loss:CrossEntropyLoss
- modernbert
- mnli
- snli
- anli
base_model: jhu-clsp/ettin-encoder-68m
datasets:
- dleemiller/FineCat-NLI
pipeline_tag: text-classification
library_name: sentence-transformers
metrics:
- f1_macro
- f1_micro
- f1_weighted
model-index:
- name: CrossEncoder based on jhu-clsp/ettin-encoder-68m
  results:
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: FineCat dev
      type: FineCat-dev
    metrics:
    - type: f1_macro
      value: 0.8213
      name: F1 Macro
    - type: f1_micro
      value: 0.8229
      name: F1 Micro
    - type: f1_weighted
      value: 0.8226
      name: F1 Weighted
---
|
|
|
|
|
# FineCat-NLI Small |
|
|
|
|
|
<p align="center"> |
|
|
<img src="/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F65ff92ea467d83751a727538%2FJzq_CZCyRYGrVgbto3eRr.png%26quot%3B%3C%2Fspan%3E style="width: 400px;"> |
|
|
</p> |
|
|
|
|
|
----- |
|
|
|
|
|
# Overview |
|
|
|
|
|
This model is a fine-tune of `jhu-clsp/ettin-encoder-68m`, trained on the
`dleemiller/FineCat-NLI` dataset, a compilation of several high-quality NLI
sources that is screened for quality and has easy samples reduced in the
training split. The training also incorporates logit distillation from
`dleemiller/finecat-nli-l`.
|
|
|
|
|
The distillation loss combines a cross-entropy term on the gold labels with a mean-squared-error term on the teacher logits:
|
|
$$
\mathcal{L} = \alpha \cdot \mathcal{L}_{\text{CE}}(z^{(s)}, y) + \beta \cdot \mathcal{L}_{\text{MSE}}(z^{(s)}, z^{(t)})
$$
|
|
|
|
|
where \\(z^{(s)}\\) and \\(z^{(t)}\\) are the student and teacher logits, \\(y\\) are the ground-truth labels, and \\(\alpha\\) and \\(\beta\\) are equally weighted at 0.5.
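
A minimal PyTorch sketch of this objective (the `distillation_loss` helper and the toy tensors below are illustrative, not the actual training code):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, beta=0.5):
    """Weighted sum of cross-entropy on gold labels and MSE against teacher logits."""
    ce = F.cross_entropy(student_logits, labels)      # L_CE(z_s, y)
    mse = F.mse_loss(student_logits, teacher_logits)  # L_MSE(z_s, z_t)
    return alpha * ce + beta * mse

# Toy batch: 4 premise-hypothesis pairs, 3 NLI classes
student_logits = torch.randn(4, 3, requires_grad=True)
teacher_logits = torch.randn(4, 3)  # produced by the teacher model in practice
labels = torch.tensor([0, 1, 2, 0])
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```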
|
|
|
|
|
This model and dataset specifically target improving NLI through high-quality sources. The tasksource models
are the best checkpoints to start from, although training from ModernBERT is also competitive.
|
|
|
|
|
----- |
|
|
|
|
|
# NLI Evaluation Results |
|
|
|
|
|
F1-micro scores (equivalent to accuracy) are reported for each dataset.
Throughput and peak GPU memory were measured at batch size 32 on an NVIDIA Blackwell PRO 6000 Max-Q.
|
|
|
|
|
| Model | finecat | mnli | mnli_mismatched | snli | anli_r1 | anli_r2 | anli_r3 | wanli | lingnli | Throughput (samples/s) | Peak GPU Mem (MB) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| `dleemiller/finecat-nli-s` | **0.7834** | 0.8725 | 0.8725 | 0.8973 | **0.6400** | **0.4660** | **0.4617** | **0.7284** | <u>0.8072</u> | 2291.87 | 415.65 |
| `tasksource/deberta-small-long-nli` | 0.7492 | 0.8194 | 0.8206 | 0.8613 | <u>0.5670</u> | <u>0.4220</u> | <u>0.4475</u> | <u>0.7034</u> | 0.7605 | 2250.66 | 1351.08 |
| `cross-encoder/nli-deberta-v3-xsmall` | 0.7269 | **0.8781** | <u>0.8777</u> | **0.9164** | 0.3620 | 0.3030 | 0.3183 | 0.6096 | **0.8122** | 2510.05 | 753.91 |
| `dleemiller/EttinX-nli-s` | 0.7251 | <u>0.8765</u> | **0.8798** | 0.9128 | 0.3360 | 0.2790 | 0.3083 | 0.6234 | 0.8012 | 2348.21 | 415.65 |
| `cross-encoder/nli-MiniLM2-L6-H768` | 0.7119 | 0.8660 | 0.8683 | <u>0.9137</u> | 0.3090 | 0.2850 | 0.2867 | 0.5830 | 0.7905 | 2885.72 | 566.64 |
| `cross-encoder/nli-distilroberta-base` | 0.6936 | 0.8365 | 0.8398 | 0.8996 | 0.2660 | 0.2810 | 0.2975 | 0.5516 | 0.7516 | 2838.17 | 566.64 |
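
As a rough sketch of how one of these scores could be reproduced (assuming the `nyu-mll/multi_nli` Hub dataset, whose label order of 0 = entailment, 1 = neutral, 2 = contradiction matches the label map below; this is not the exact benchmark harness used here):

```python
from datasets import load_dataset
from sentence_transformers import CrossEncoder
from sklearn.metrics import f1_score
import numpy as np

model = CrossEncoder("dleemiller/finecat-nli-s")

# Evaluate on the MNLI matched validation split
mnli = load_dataset("nyu-mll/multi_nli", split="validation_matched")
pairs = list(zip(mnli["premise"], mnli["hypothesis"]))

logits = model.predict(pairs, batch_size=32)
preds = np.argmax(logits, axis=1)
print("f1_micro:", f1_score(mnli["label"], preds, average="micro"))
```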
|
|
|
|
|
----- |
|
|
|
|
|
# Usage |
|
|
|
|
|
### Label Map
- `entailment`: 0
- `neutral`: 1
- `contradiction`: 2
|
|
|
|
|
|
|
|
## Direct Usage (Sentence Transformers) |
|
|
|
|
|
First install the Sentence Transformers library: |
|
|
|
|
|
```bash |
|
|
pip install -U sentence-transformers |
|
|
``` |
|
|
|
|
|
Then you can load the model and run inference. |
|
|
|
|
|
```python
from sentence_transformers import CrossEncoder
import numpy as np

model = CrossEncoder("dleemiller/finecat-nli-s")
id2label = model.model.config.id2label  # {0: 'entailment', 1: 'neutral', 2: 'contradiction'}

pairs = [
    ("The glass fell off the counter and shattered on the tile.",
     "The glass broke when it hit the floor."),  # E
    ("The store opens at 9 a.m. every day.",
     "The store opens at 7 a.m. on weekdays."),  # C
    ("A researcher presented results at the conference.",
     "The presentation won the best paper award."),  # N
    ("It started raining heavily, so the match was postponed.",
     "The game was delayed due to weather."),  # E
    ("Every seat on the flight was taken.",
     "There were several empty seats on the plane."),  # C
]

logits = model.predict(pairs)  # shape: (5, 3)

for (prem, hyp), row in zip(pairs, logits):
    pred_idx = int(np.argmax(row))
    pred = id2label[pred_idx]
    print(f"[{pred}] Premise: {prem} | Hypothesis: {hyp}")
```
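
The checkpoint can also be used without sentence-transformers. Below is a sketch with plain `transformers`, assuming the repository exposes a standard sequence-classification head (as CrossEncoder checkpoints do) and a `transformers` version with ModernBERT support:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "dleemiller/finecat-nli-s"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

# Contradiction example from the pairs above
premise = "Every seat on the flight was taken."
hypothesis = "There were several empty seats on the plane."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 3)

probs = logits.softmax(dim=-1).squeeze(0)
print({model.config.id2label[i]: round(p.item(), 4) for i, p in enumerate(probs)})
```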
|
|
|
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
We thank the `tasksource` creators and contributors and `MoritzLaurer` for making their work available.
This model would not be possible without their efforts and open-source contributions.
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex
@misc{nli-compiled-2025,
  title = {FineCat NLI Dataset},
  author = {Lee Miller},
  year = {2025},
  howpublished = {Refined compilation of 6 major NLI datasets}
}
```