Spestly committed
Commit beb535d · verified · 1 Parent(s): 77ff9d8

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -15,8 +15,8 @@ library_name: transformers
  - **Model Developer:** Aayan Mishra
  - **Model Type:** Causal Language Model
  - **Architecture:** Transformer with Rotary Position Embeddings (RoPE), SwiGLU activation, RMSNorm, and Attention QKV bias
- - **Parameters:** 14.7 billion total (13.1 billion non-embedding)
- - **Layers:** 48
+ - **Parameters:** 14.7 billion total (13.1 billion non-embedding)
+ - **Layers:** 28
  - **Attention Heads:** 28 for query and 4 for key-value (Grouped Query Attention)
  - **Vocabulary Size:** Approximately 151,646 tokens
  - **Context Length:** Supports up to 131,072 tokens
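
The corrected figures in this diff can be sanity-checked against the model's own `config.json` using the `transformers` library named in the README metadata. Below is a minimal sketch; the repository ID is a placeholder assumption, not the model's actual Hub path, and the attribute names follow the common naming used by Qwen2-style causal LM configs.

```python
# Minimal sketch: compare the README's architecture figures against config.json.
# NOTE: "Spestly/model-repo" is a hypothetical placeholder -- replace it with the
# actual Hugging Face repository for this model.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Spestly/model-repo")

print(config.num_hidden_layers)        # README (after this commit): 28 layers
print(config.num_attention_heads)      # README: 28 query heads
print(config.num_key_value_heads)      # README: 4 key/value heads (Grouped Query Attention)
print(config.vocab_size)               # README: ~151,646 tokens
print(config.max_position_embeddings)  # README: up to 131,072 tokens
```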