---
language:
- en
license: mit
library_name: peft
tags:
- reranking
- information-retrieval
- pointwise
- lora
- peft
- binary-cross-entropy
base_model: meta-llama/Llama-3.1-8B
datasets:
- Tevatron/msmarco-passage
- abdoelsayed/DeAR-COT
pipeline_tag: text-classification
---
# DeAR-8B-Reranker-CE-LoRA-v1
## Model Description
**DeAR-8B-Reranker-CE-LoRA-v1** is a LoRA (Low-Rank Adaptation) adapter for neural reranking trained with Binary Cross-Entropy loss. This lightweight adapter requires only ~100MB of storage and can be applied to LLaMA-3.1-8B to achieve near full-model performance with minimal overhead.
## Model Details
- **Model Type:** LoRA Adapter for Pointwise Reranking
- **Base Model:** meta-llama/Llama-3.1-8B
- **Adapter Size:** ~100MB
- **Training Method:** LoRA with Binary Cross-Entropy + Knowledge Distillation
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Trainable Parameters:** 67M (0.8% of total)
## Key Features
✅ **Ultra Lightweight:** ~100 MB of storage
✅ **Efficient:** ~3× faster training than full fine-tuning
✅ **High Performance:** ~98% of full-model accuracy
✅ **Easy Integration:** Simple adapter loading
✅ **Classification-based:** Binary relevance prediction
## Usage
### Load and Use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel, PeftConfig

# Load the LoRA adapter config to locate the base model
adapter_path = "abdoelsayed/dear-8b-reranker-ce-lora-v1"
config = PeftConfig.from_pretrained(adapter_path)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load base model with a single-logit classification head
base_model = AutoModelForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=1,
    torch_dtype=torch.bfloat16,
)
# Ensure the classification head reads the last non-padding token
base_model.config.pad_token_id = tokenizer.pad_token_id

# Load the LoRA adapter and merge it into the base weights
model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()
model.eval().cuda()

# Score a query-document pair
query = "What is machine learning?"
document = "Machine learning is a subset of artificial intelligence..."

inputs = tokenizer(
    f"query: {query}",
    f"document: {document}",
    return_tensors="pt",
    truncation=True,
    max_length=228,
    padding="max_length",
)
inputs = {k: v.cuda() for k, v in inputs.items()}

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score}")
```
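Because the head is trained with binary cross-entropy, the raw logit can be passed through a sigmoid to obtain a relevance probability in (0, 1). Ranking by raw logits or by probabilities gives the same ordering, since the sigmoid is monotonic; the snippet below is an optional convenience.

```python
import torch

# Optional: map the BCE-trained logit to a (0, 1) relevance probability.
probability = torch.sigmoid(torch.tensor(score)).item()
print(f"Relevance probability: {probability:.4f}")
```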
### Batch Reranking
```python
@torch.inference_mode()
def rerank(tokenizer, model, query: str, documents, batch_size=64):
    """Score (title, text) pairs against a query; return (index, score) pairs sorted by relevance."""
    scores = []
    device = next(model.parameters()).device
    for i in range(0, len(documents), batch_size):
        batch = documents[i:i + batch_size]
        queries = [f"query: {query}"] * len(batch)
        docs = [f"document: {title} {text}" for title, text in batch]
        inputs = tokenizer(queries, docs, return_tensors="pt",
                           truncation=True, max_length=228, padding=True)
        inputs = {k: v.to(device) for k, v in inputs.items()}
        logits = model(**inputs).logits.squeeze(-1)
        scores.extend(logits.cpu().tolist())
    return sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
```
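A minimal usage sketch for the helper above. The `(title, text)` passages are illustrative placeholders, and `tokenizer`/`model` are assumed to be the objects loaded in the previous section.

```python
# Illustrative example: rerank three hypothetical (title, text) passages for one query.
documents = [
    ("Machine Learning", "Machine learning is a subset of artificial intelligence..."),
    ("Gardening", "Tomatoes grow best in well-drained soil with full sun."),
    ("Deep Learning", "Deep learning uses multi-layer neural networks to learn representations."),
]

ranking = rerank(tokenizer, model, "What is machine learning?", documents)
for rank, (doc_idx, score) in enumerate(ranking, start=1):
    title = documents[doc_idx][0]
    print(f"{rank}. {title} (score={score:.3f})")
```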
## Training Details
### LoRA Configuration
```python
{
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "SEQ_CLS"
}
```
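The dictionary above corresponds one-to-one to `peft.LoraConfig`. As a rough sanity check on the trainable-parameter figure listed under Model Details (~67M, ≈0.8%), the sketch below builds the config, wraps the base model, and prints the trainable fraction; loading the 8B base model is only needed if you actually want to reproduce the count.

```python
import torch
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.SEQ_CLS,
)

# Wrap the base model and report trainable vs. total parameters
# (expect roughly 67M trainable, ~0.8% of the 8B base).
base = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.1-8B", num_labels=1, torch_dtype=torch.bfloat16
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()
```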
### Training Hyperparameters
- **Learning Rate:** 1e-4
- **Batch Size:** 4
- **Gradient Accumulation:** 2
- **Epochs:** 2
- **Hardware:** 4x A100 (40GB)
- **Training Time:** ~12 hours
- **Memory:** ~28 GB per GPU (see the sketch after this list for how these settings map to `TrainingArguments`)
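A hedged sketch of how the hyperparameters above might map onto Hugging Face `TrainingArguments`. The output directory is a placeholder, and the snippet omits the actual training loop (BCE loss with knowledge distillation from the teacher), so treat it as orientation rather than a reproduction recipe.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the listed hyperparameters; directory name is illustrative.
training_args = TrainingArguments(
    output_dir="dear-8b-ce-lora",       # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=4,      # 4 per GPU on 4x A100 (40GB)
    gradient_accumulation_steps=2,      # effective batch size: 4 * 2 * 4 GPUs = 32
    num_train_epochs=2,
    bf16=True,                          # matches the bfloat16 dtype used at inference
    logging_steps=50,
    save_strategy="epoch",
)
```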
## Advantages
| Feature | LoRA Adapter | Full Fine-Tuning |
|---------|--------------|------------------|
| Storage | ~100 MB | ~16 GB |
| Training time | ~12 h | ~34 h |
| Performance (relative) | ~98% | 100% |
| Peak GPU memory (training) | ~28 GB | ~38 GB |
## Related Models
- [DeAR-8B-CE](https://huggingface.co/abdoelsayed/dear-8b-reranker-ce-v1) - Full model
- [DeAR-8B-RankNet-LoRA](https://huggingface.co/abdoelsayed/dear-8b-reranker-ranknet-lora-v1) - RankNet variant
- [Teacher Model](https://huggingface.co/abdoelsayed/llama2-13b-rankllama-teacher)
## Citation
```bibtex
@article{abdallah2025dear,
title={DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation},
author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Jatowt, Adam},
journal={arXiv preprint arXiv:2508.16998},
year={2025}
}
```
## License
MIT License
## More Information
- **GitHub:** [DataScienceUIBK/DeAR-Reranking](https://github.com/DataScienceUIBK/DeAR-Reranking)
- **Paper:** [arXiv:2508.16998](https://arxiv.org/abs/2508.16998)
- **Collection:** [DeAR Models](https://huggingface.co/collections/abdoelsayed/dear-reranking)