Mistral-Syndicate-7B
Model Description:
Mistral Syndicate is not a state-of-the-art model; it is a fine-tuning experiment meant to explore training dynamics specific to large language models. The fine-tuning dataset was generated by a "syndicate" of other open language models, some of similar parameter count and some larger. Each model generated a response to a given instruction, and the group then voted on which model's response was best.
The instructions used to synthesize the output labels were a curated subset of VMWare/open-instruct, supplemented with additional instructions synthesized from scratch.
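The voting step described above can be illustrated with a minimal sketch. The card does not say how votes were cast or counted, whether a model could vote for its own response, or how ties were resolved, so everything here is an assumption: simple plurality voting, ties broken by candidate order, and a hypothetical helper named `pick_by_vote`.

```python
from collections import Counter


def pick_by_vote(candidates, votes):
    """Return (model_name, response) for the response with the most votes.

    candidates: dict mapping model name -> generated response.
    votes: list of model names, one entry per vote cast.
    Plurality voting and the tie-break rule are assumptions, not
    details taken from the model card.
    """
    # Count only votes that name a known candidate.
    tally = Counter(v for v in votes if v in candidates)
    if not tally:
        raise ValueError("no valid votes were cast")
    top = max(tally.values())
    # Tie-break (assumed): the first tied candidate in insertion order wins.
    winner = next(name for name in candidates if tally[name] == top)
    return winner, candidates[winner]


# Hypothetical model names and responses, for illustration only.
candidates = {
    "model_a": "Response from A",
    "model_b": "Response from B",
    "model_c": "Response from C",
}
votes = ["model_b", "model_a", "model_b"]
winner, response = pick_by_vote(candidates, votes)
```

The winning response would then serve as the output label for that instruction in the fine-tuning set.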
Prompt template
With context
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
### Input:
### Response:
Without context
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
### Response:
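A small helper can assemble either variant of the template. This is a sketch, not code shipped with the model: the function name `build_prompt` and its parameters are hypothetical, and the exact whitespace (text placed directly under each header, blank lines between sections) is assumed because the card does not spell it out.

```python
# Preamble and header strings are copied from the template above.
# Placement of newlines and of the instruction/input text is an assumption.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction, context=None):
    """Format a prompt, using the 'with context' variant when context is given."""
    if context:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=context)
    return PROMPT_NO_INPUT.format(instruction=instruction)


# Example usage with made-up inputs.
plain = build_prompt("Name three primary colors.")
with_context = build_prompt("Summarize the text.", "Mistral Syndicate is an experiment.")
```

The resulting string is what would be tokenized and passed to the model; generation is expected to continue after the final `### Response:` line.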
Evaluation Results
Results as of December 30, 2023:
| Benchmark | Result |
|---|---|
| ARC | 60.84 |
| HellaSwag | 82.91 |
| MMLU | 60.83 |
| TruthfulQA | 43.71 |
| Winogrande | 78.61 |
| GSM8K | 44.50 |
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 61.90 |
| AI2 Reasoning Challenge (25-shot) | 60.84 |
| HellaSwag (10-shot) | 82.91 |
| MMLU (5-shot) | 60.83 |
| TruthfulQA (0-shot) | 43.71 |
| Winogrande (5-shot) | 78.61 |
| GSM8K (5-shot) | 44.50 |
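The "Avg." row is consistent with an unweighted mean of the six benchmark scores in this table. A quick check, with the scores copied from the table and the dictionary layout chosen only for illustration:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge": 60.84,
    "HellaSwag": 82.91,
    "MMLU": 60.83,
    "TruthfulQA": 43.71,
    "Winogrande": 78.61,
    "GSM8K": 44.50,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 61.90
```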
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 13.85 |
| IFEval (0-shot) | 24.96 |
| BBH (3-shot) | 20.51 |
| MATH Lvl 5 (4-shot) | 2.42 |
| GPQA (0-shot) | 3.47 |
| MuSR (0-shot) | 13.62 |
| MMLU-PRO (5-shot) | 18.13 |