Update README.md
- **Model Developer:** Aayan Mishra
- **Model Type:** Causal Language Model
- **Architecture:** Transformer with Rotary Position Embeddings (RoPE), SwiGLU activation, RMSNorm, and Attention QKV bias
- **Parameters:** 14.7 billion total (13.1 billion non-embedding)
- **Layers:** 28
- **Attention Heads:** 28 for query and 4 for key-value (Grouped Query Attention)
- **Vocabulary Size:** 151,646 tokens
- **Context Length:** Up to 131,072 tokens
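The head counts above imply a 7:1 grouping: each of the 4 key-value heads is shared by 7 of the 28 query heads. A minimal NumPy sketch of that sharing pattern (the `head_dim` and `seq_len` values are illustrative assumptions, not taken from this card):

```python
import numpy as np

# Grouped Query Attention sketch using the card's head counts:
# 28 query heads share 4 key-value heads -> 7 query heads per KV head.
# head_dim=128 and seq_len=8 are illustrative assumptions.
n_q_heads, n_kv_heads, head_dim, seq_len = 28, 4, 128, 8
group = n_q_heads // n_kv_heads  # 7 query heads per KV head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq_len, head_dim))
k = rng.standard_normal((n_kv_heads, seq_len, head_dim))
v = rng.standard_normal((n_kv_heads, seq_len, head_dim))

# Expand each KV head across its group of query heads, then attend as usual.
k_exp = np.repeat(k, group, axis=0)  # (28, seq_len, head_dim)
v_exp = np.repeat(v, group, axis=0)  # (28, seq_len, head_dim)

scores = q @ k_exp.transpose(0, 2, 1) / np.sqrt(head_dim)  # (28, seq, seq)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)              # row-wise softmax
out = weights @ v_exp                                       # (28, seq, head_dim)
print(out.shape)
```

Compared with full multi-head attention, this cuts the KV cache by a factor of 7, which is what makes the 131,072-token context length practical.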