--- library_name: transformers license: mit datasets: - rajpurkar/squad_v2 language: - en base_model: - answerdotai/ModernBERT-base pipeline_tag: question-answering --- # ModernBERT-base-squad2 ModernBERT fine-tuned for extractive question answering tasks. Use this model to extract specific spans of text within a given context that directly answer questions. - Base Model: `answerdotai/ModernBERT-base` - Fine-tuned on: SQuAD 2.0 dataset - Use: Extractive question answering --- # Usage ```python from transformers import AutoTokenizer, AutoModel import torch.nn.functional as F import torch def predict_answers(batch, model, tokenizer, device): inputs = tokenizer( [item["question"] for item in batch], [item["context"] for item in batch], return_tensors="pt", max_length=512, truncation=True, padding="max_length", ).to(device) with torch.no_grad(): outputs = model(**inputs) start_probs = F.softmax(outputs.start_logits, dim=-1) end_probs = F.softmax(outputs.end_logits, dim=-1) start_indices = torch.argmax(start_probs, dim=-1) end_indices = torch.argmax(end_probs, dim=-1) return [ ( tokenizer.decode(inputs["input_ids"][i][start:end + 1], skip_special_tokens=True), (start_probs[i, start] * end_probs[i, end]).item(), ) for i, (start, end) in enumerate(zip(start_indices, end_indices)) ] model_id = "smangla/ModernBERT-base-squad2" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModel.from_pretrained(model_id, trust_remote_code=True) device = torch.device("cuda" if torch.cuda.is_available() else "cpu") model.to(device) batch = [ {"question": "What is the capital of France?", "context": "Paris is the capital of France."}, {"question": "Who wrote Hamlet?", "context": "William Shakespeare wrote the play Hamlet."}, ] results = predict_answers(batch, model, tokenizer, device) for i, (answer, prob) in enumerate(results): print(f"Question {i + 1}: {batch[i]['question']}") print(f"Answer: {answer}") print(f"Probability: {prob:.4f}") ``` Output: ``` Question 1: What is the capital of France? Answer: Paris Probability: 0.9929 Question 2: Who wrote Hamlet? Answer: William Shakespeare Probability: 0.9995 ``` --- # Metrics Evaluation results using the official evaluation script on SQuAD 2.0 dev set: ```json { "exact": 80.29141750189505, "f1": 83.22890970115323, "total": 11873, "HasAns_exact": 72.08164642375169, "HasAns_f1": 77.96505480462089, "HasAns_total": 5928, "NoAns_exact": 88.47771236333053, "NoAns_f1": 88.47771236333053, "NoAns_total": 5945 } ``` --- # Limitations - The model solely extracts answers from the input context to generate answers and does not use external knowledge. --- # Training Details - **Dataset:** SQuAD 2.0 ([https://huggingface.co/datasets/rajpurkar/squad_v2](https://huggingface.co/datasets/rajpurkar/squad_v2)) - **Epochs:** 4 - **Batch Size:** 32 - **Scheduler**: Linear - **Learning Rate:** 5e-5 - **Weight decay:** 0.01 - **Warmup ratio:** 0.6 ---