Llama-3.1-8B-DeepSeek-Distilled
Model Overview
This model was distilled from deepseek-ai/deepseek-llm-67b-base, a 67B-parameter teacher model available on Hugging Face, into the meta-llama/Llama-3.1-8B student architecture.
Evaluation Scores
- XTREME - XNLI (en): Accuracy: 0.620
- SuperGLUE - BoolQ: Accuracy: 0.848
- GLUE - SST-2: Accuracy: 0.938
- SQuAD:
  - Exact Match: 71.8
  - F1 Score: 84.7
Usage
Load the model and tokenizer from the Hugging Face Hub:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("enesarda22/Llama-3.1-8B-DeepSeek67B-Distilled")
tokenizer = AutoTokenizer.from_pretrained("enesarda22/Llama-3.1-8B-DeepSeek67B-Distilled")
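Once loaded, the model can be used for standard causal text generation. The sketch below wraps loading and greedy decoding in a helper function; the prompt, the `max_new_tokens` default, and the bfloat16 dtype are illustrative choices, not settings from the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the distilled model and greedily decode a completion for `prompt`.

    Note: loading an 8B model downloads ~16 GB of weights on first use.
    """
    repo = "enesarda22/Llama-3.1-8B-DeepSeek67B-Distilled"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example usage (commented out to avoid downloading weights unintentionally):
# print(generate("Question: What is distillation in machine learning? Answer:"))
```

Since this is a base-style distilled model rather than a chat model, plain-text continuation prompts like the one above are likely a better fit than chat templates.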