---
library_name: transformers
license: mit
datasets:
- rajpurkar/squad_v2
language:
- en
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: question-answering
---
# ModernBERT-base-squad2
ModernBERT fine-tuned for extractive question answering. Given a question and a context, the model extracts the span of text from the context that directly answers the question.
- Base Model: `answerdotai/ModernBERT-base`
- Fine-tuned on: SQuAD 2.0 dataset
- Use: Extractive question answering
---
# Usage
```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch
import torch.nn.functional as F


def predict_answers(batch, model, tokenizer, device):
    # Tokenize the question/context pairs as one padded batch.
    inputs = tokenizer(
        [item["question"] for item in batch],
        [item["context"] for item in batch],
        return_tensors="pt",
        max_length=512,
        truncation=True,
        padding="max_length",
    ).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    # Convert start/end logits to probabilities and take the argmax positions.
    start_probs = F.softmax(outputs.start_logits, dim=-1)
    end_probs = F.softmax(outputs.end_logits, dim=-1)
    start_indices = torch.argmax(start_probs, dim=-1)
    end_indices = torch.argmax(end_probs, dim=-1)
    # Decode each predicted span and report its joint start*end probability.
    return [
        (
            tokenizer.decode(inputs["input_ids"][i][start:end + 1], skip_special_tokens=True),
            (start_probs[i, start] * end_probs[i, end]).item(),
        )
        for i, (start, end) in enumerate(zip(start_indices, end_indices))
    ]


model_id = "smangla/ModernBERT-base-squad2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load with the question-answering head so outputs expose start_logits/end_logits.
model = AutoModelForQuestionAnswering.from_pretrained(model_id, trust_remote_code=True)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

batch = [
    {"question": "What is the capital of France?", "context": "Paris is the capital of France."},
    {"question": "Who wrote Hamlet?", "context": "William Shakespeare wrote the play Hamlet."},
]
results = predict_answers(batch, model, tokenizer, device)
for i, (answer, prob) in enumerate(results):
    print(f"Question {i + 1}: {batch[i]['question']}")
    print(f"Answer: {answer}")
    print(f"Probability: {prob:.4f}")
```
Output:
```
Question 1: What is the capital of France?
Answer: Paris
Probability: 0.9929
Question 2: Who wrote Hamlet?
Answer: William Shakespeare
Probability: 0.9995
```
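For quick experiments, the model can also be used through the high-level `pipeline` API. This is a minimal sketch, assuming the checkpoint loads under the standard question-answering pipeline; `handle_impossible_answer=True` lets the pipeline return an empty answer for unanswerable questions, which SQuAD 2.0 includes.
```python
from transformers import pipeline

qa = pipeline("question-answering", model="smangla/ModernBERT-base-squad2")
result = qa(
    question="What is the capital of France?",
    context="Paris is the capital of France.",
    handle_impossible_answer=True,  # allow an empty answer for unanswerable questions
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'Paris'}
```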
---
# Metrics
Evaluation results on the SQuAD 2.0 dev set, computed with the official evaluation script:
```json
{
"exact": 80.29141750189505,
"f1": 83.22890970115323,
"total": 11873,
"HasAns_exact": 72.08164642375169,
"HasAns_f1": 77.96505480462089,
"HasAns_total": 5928,
"NoAns_exact": 88.47771236333053,
"NoAns_f1": 88.47771236333053,
"NoAns_total": 5945
}
```
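For reference, comparable figures can be computed with the `squad_v2` metric from the Hugging Face `evaluate` library. A minimal sketch with a single hypothetical example (`"example-1"` is a made-up id; a real evaluation runs over all 11,873 dev examples):
```python
import evaluate

squad_v2 = evaluate.load("squad_v2")
# "example-1" is a hypothetical id; real ids come from the squad_v2 dataset.
predictions = [
    {"id": "example-1", "prediction_text": "Paris", "no_answer_probability": 0.0}
]
references = [
    {"id": "example-1", "answers": {"text": ["Paris"], "answer_start": [0]}}
]
print(squad_v2.compute(predictions=predictions, references=references))
```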
---
# Limitations
- The model only extracts answers from the provided context; it does not draw on external knowledge. A sketch of the no-answer case follows below.
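Because it is trained on SQuAD 2.0, the model can signal that a question is unanswerable from the given context. Using the `predict_answers` helper from the Usage section, a no-answer prediction typically decodes to an empty string (both span indices collapse to the special token at position 0); a sketch under that assumption:
```python
# Context deliberately does not contain the answer.
batch = [{"question": "Who wrote Hamlet?", "context": "Paris is the capital of France."}]
answer, prob = predict_answers(batch, model, tokenizer, device)[0]
# For SQuAD 2.0 models, an unanswerable question usually yields an empty span.
print(repr(answer))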
---
# Training Details
- **Dataset:** SQuAD 2.0 ([https://huggingface.co/datasets/rajpurkar/squad_v2](https://huggingface.co/datasets/rajpurkar/squad_v2))
- **Epochs:** 4
- **Batch Size:** 32
- **Scheduler:** Linear
- **Learning Rate:** 5e-5
- **Weight Decay:** 0.01
- **Warmup Ratio:** 0.6
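These hyperparameters would correspond to roughly the following `TrainingArguments`; this is a hypothetical reconstruction, since the training script itself is not published with this card:
```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed hyperparameters;
# the output_dir is an assumption, not part of this card.
args = TrainingArguments(
    output_dir="ModernBERT-base-squad2",
    num_train_epochs=4,
    per_device_train_batch_size=32,
    learning_rate=5e-5,
    weight_decay=0.01,
    warmup_ratio=0.6,
    lr_scheduler_type="linear",
)
```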
---