DeitFake: DeiT-Based Deepfake Detection
Model Card for sakshamkr1/deepfake-fb-deit-vit-224
Model Description
DeitFake is a fine-tuned Data-efficient Image Transformer (DeiT), a Vision Transformer (ViT) variant, based on facebook/deit-base-patch16-224 and trained for deepfake image classification. The model classifies images as 'Fake' or 'Real' and was fine-tuned on the Deepfake and Real Images dataset, which is derived from the OpenForensics dataset.
Intended Uses
This model is designed for research and educational purposes in deepfake detection and general image integrity verification.
Possible use cases:
- Deepfake detection in research pipelines
- Media authenticity analysis
- Benchmarking transformer-based vision architectures for binary classification tasks
Not recommended for production-level forensic verification without further validation.
Training Data
The model was fine-tuned on the Deepfake and Real Images dataset (derived from OpenForensics). The dataset includes both artificially generated (fake) and real facial images.
To ensure balanced representation, random over-sampling was applied during the training phase.
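The exact balancing code is not published with this card; the snippet below is a minimal sketch of random over-sampling on a Hugging Face datasets split, assuming a dataset object with a "label" column (the function name and the train_ds variable are illustrative).

```python
import random
from collections import Counter

def random_oversample(dataset, seed=42):
    """Duplicate minority-class rows (sampling with replacement) until classes are balanced."""
    labels = dataset["label"]
    counts = Counter(labels)
    majority = max(counts.values())
    extra = []
    for label, count in counts.items():
        idx = [i for i, l in enumerate(labels) if l == label]
        extra.extend(random.Random(seed).choices(idx, k=majority - count))
    return dataset.select(list(range(len(dataset))) + extra).shuffle(seed=seed)

# balanced_train = random_oversample(train_ds)  # train_ds: the (assumed) training split
```

As the card notes, over-sampling was applied only during training, so the evaluation data keeps its natural class distribution.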
Training Procedure
Fine-tuning was performed with Hugging Face’s transformers library (Trainer API) using the settings below; a short sketch of the setup follows the list:
- Base model: facebook/deit-base-patch16-224
- Epochs: 5
- Learning rate: 1e-5
- Weight decay: 0.01
- Optimizer: AdamW
- Mixed precision: fp16=True
- Framework: PyTorch (CUDA enabled)
- Loss function: CrossEntropyLoss
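The full training script is not part of this card; the following is a minimal sketch of how the listed settings map onto the Trainer API. Here train_ds and eval_ds are assumed to be preprocessed splits with pixel_values and labels, and the Fake/Real label order is an assumption; AdamW and CrossEntropyLoss are the Trainer defaults for this model class.

```python
from transformers import AutoModelForImageClassification, Trainer, TrainingArguments

model = AutoModelForImageClassification.from_pretrained(
    "facebook/deit-base-patch16-224",
    num_labels=2,
    id2label={0: "Fake", 1: "Real"},   # label order is an assumption
    label2id={"Fake": 0, "Real": 1},
    ignore_mismatched_sizes=True,      # replace the 1000-class ImageNet head with a 2-class head
)

training_args = TrainingArguments(
    output_dir="deitfake",
    num_train_epochs=5,
    learning_rate=1e-5,
    weight_decay=0.01,
    fp16=True,                         # mixed precision on CUDA
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,            # assumed preprocessed training split
    eval_dataset=eval_ds,              # assumed preprocessed evaluation split
)
trainer.train()
```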
Evaluation Results (V2 Checkpoint)
Final performance metrics on the test set:
| Metric | Value |
|---|---|
| Test Loss | 0.0219 |
| Accuracy | 0.9922 |
| Macro F1-Score | 0.9922 |
| AUROC | 0.9997 |
| Runtime (s) | 48.26 |
| Samples/sec | 395.23 |
| Steps/sec | 6.18 |
Classification Report
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Fake | 0.9909 | 0.9936 | 0.9922 | 9521 |
| Real | 0.9936 | 0.9909 | 0.9922 | 9520 |
| Accuracy | | | 0.9922 | 19041 |
| Macro avg | 0.9922 | 0.9922 | 0.9922 | 19041 |
| Weighted avg | 0.9922 | 0.9922 | 0.9922 | 19041 |
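For reference, the metrics above correspond to standard scikit-learn calls. A minimal, self-contained sketch with placeholder arrays (in practice y_true, y_pred, and y_score would come from running the model over the test split):

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, f1_score, roc_auc_score

# Placeholder values -- replace with real test-set outputs
y_true = np.array([0, 0, 1, 1])            # ground-truth labels (0 = Fake, 1 = Real; assumed order)
y_pred = np.array([0, 1, 1, 1])            # argmax of the model logits
y_score = np.array([0.1, 0.6, 0.8, 0.9])   # predicted probability of class 1

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro F1:", f1_score(y_true, y_pred, average="macro"))
print("AUROC   :", roc_auc_score(y_true, y_score))
print(classification_report(y_true, y_pred, target_names=["Fake", "Real"], digits=4))
```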
How to Use
You can load and use this model easily with the Hugging Face Transformers library:
from transformers import AutoFeatureExtractor, AutoModelForImageClassification
from PIL import Image
import torch

# Load the feature extractor and the fine-tuned model from the Hub
model_id = "sakshamkr1/deepfake-fb-deit-vit-224"
extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)
model.eval()

# Load an image
image = Image.open("sample_image.jpg").convert("RGB")

# Prepare inputs
inputs = extractor(images=image, return_tensors="pt")

# Run inference
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

predicted_class = logits.argmax(-1).item()
labels = model.config.id2label
print(f"Predicted class: {labels[predicted_class]}")
Citation
If you use this model in your research, please cite and credit as follows:
@misc{deitfake_2025,
  title     = {DeitFake: DeiT-Based Deepfake Detection},
  author    = {Saksham Kumar},
  year      = {2025},
  publisher = {Hugging Face},
  doi       = {10.57967/hf/6767},
  url       = {https://huggingface.co/sakshamkr1/deepfake-fb-deit-vit-224}
}
Author
Developed by Saksham Kumar
LinkedIn: sakshamkr1