Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -12,7 +12,12 @@ widget:
 
  # Model Card for Mistral-7B-Instruct-v0.2
 
- The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).
 
  For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/la-plateforme/).
 
@@ -53,12 +58,6 @@ decoded = tokenizer.batch_decode(generated_ids)
  print(decoded[0])
  ```
 
- ## Model Architecture
- This instruction model is based on Mistral-7B-v0.1, a transformer model with the following architecture choices:
- - Grouped-Query Attention
- - Sliding-Window Attention
- - Byte-fallback BPE tokenizer
-
  ## Troubleshooting
  - If you see the following error:
  ```
 
 
  # Model Card for Mistral-7B-Instruct-v0.2
 
+ The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.
+
+ Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1
+ - 32k context window (vs 8k context in v0.1)
+ - Rope-theta = 1e6
+ - No Sliding-Window Attention
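
Note on the added list: the larger Rope-theta and the longer context window are related. The following is a minimal sketch, assuming the standard RoPE frequency schedule, a head dimension of 128, and theta = 1e4 for v0.1 (none of these values are stated in this diff). `rope_wavelengths` is a hypothetical helper written for illustration, not part of any library.

```python
import math


def rope_wavelengths(theta: float, head_dim: int = 128) -> list[float]:
    """Wavelength, in tokens, of each rotary frequency pair.

    Assumes the standard RoPE schedule inv_freq_i = theta ** (-2i / head_dim),
    so the wavelength of pair i is 2 * pi * theta ** (2i / head_dim).
    """
    return [
        2 * math.pi * theta ** (2 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# Assumed values: head_dim = 128 and theta = 1e4 for v0.1 are not in this diff.
old = rope_wavelengths(1e4)
new = rope_wavelengths(1e6)  # Rope-theta = 1e6 from the list above

context = 32 * 1024  # the 32k context window from the list above

# Count frequency pairs whose wavelength exceeds the context window,
# i.e. pairs that do not complete a full rotation within 32k tokens.
for name, waves in (("theta=1e4", old), ("theta=1e6", new)):
    longer = sum(w > context for w in waves)
    print(
        f"{name}: longest wavelength {waves[-1]:.3g} tokens, "
        f"{longer}/{len(waves)} pairs longer than {context} tokens"
    )
```

Under these assumptions, a larger theta stretches the slow-rotating pairs, so more of them stay unambiguous across the full 32k window.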
 
  For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/la-plateforme/).
 
 
  print(decoded[0])
  ```
 
  ## Troubleshooting
  - If you see the following error:
  ```