Update README.md
README.md CHANGED

@@ -63,7 +63,7 @@ The model shows improvements in **instruction understanding and task completion**
 
 ## Technical summary
 
-- Architecture: LLaMA 3.2,
+- Architecture: LLaMA 3.2, 28 transformer layers, 3072 hidden size, 24 heads
 - Sequence length: 4096 tokens
 - Training hardware: 8× A100 80GB GPUs
 - Continual pretraining corpus: 531 M Basque words (ZelaiHandi) + 300 M English tokens (FineWeb subset)
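The architecture figures added by this commit map directly onto a Hugging Face `LlamaConfig`. A minimal sketch, assuming the standard `transformers` Llama configuration class; fields not listed in the summary (vocabulary size, intermediate size, number of KV heads, etc.) are left at library defaults and are not taken from the model card:

```python
# Sketch only: express the README's architecture summary as a LlamaConfig.
# Values marked "from the summary" come from the diff above; everything else
# is a transformers default, not a documented property of this model.
from transformers import LlamaConfig

config = LlamaConfig(
    num_hidden_layers=28,           # 28 transformer layers (from the summary)
    hidden_size=3072,               # 3072 hidden size (from the summary)
    num_attention_heads=24,         # 24 attention heads (from the summary)
    max_position_embeddings=4096,   # 4096-token sequence length (from the summary)
)

print(config)
```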