Austin207 committed on
Commit
c51942b
·
verified ·
1 Parent(s): 445fd2d

Update README.md

Files changed (1)
  1. README.md +14 -14
README.md CHANGED
@@ -53,11 +53,11 @@ model-index:
 
 ## Key Features
 
- - 🚀 **Efficient Training**: Trained on RTX 5070 (8GB VRAM) in ~4 hours
- - 📝 **Extended Context**: 16,384 token context window (16x typical small models)
- - 🧠 **Memory Efficient**: Only 1.3GB VRAM for 1,800 tokens inference
- - ⚡ **Fast Inference**: ~10 tokens/second on consumer GPU
- - 🎯 **High Quality Data**: Trained on curated RefinedWeb subset
+ - **Efficient Training**: Trained on RTX 5070 (8GB VRAM) in ~4 hours
+ - **Extended Context**: 16,384 token context window (16x typical small models)
+ - **Memory Efficient**: Only 1.3GB VRAM for 1,800 tokens inference
+ - **Fast Inference**: ~10 tokens/second on consumer GPU
+ - **High Quality Data**: Trained on curated RefinedWeb subset
 
 ## Architecture Details
 
@@ -136,7 +136,7 @@ model-index:
 ## Usage
 
 ### Quick Start
- ---
+ ```
 import torch
 from transformers import AutoTokenizer
 from model_neo import NeoMini, NeoMiniConfig
@@ -157,11 +157,11 @@ input_ids = tokenizer.encode(prompt, return_tensors="pt")
 with torch.no_grad():
     output = model.generate(input_ids, max_length=100, temperature=0.8)
 print(tokenizer.decode(output))
- ---
+ ```
 ### Interactive Chat
- ---
+ ```
 python interactive_chat.py
- ---
+ ```
 
 ### Generation Parameters
 - **Temperature**: 0.7-0.9 for creative tasks, 0.3-0.5 for factual
@@ -204,7 +204,7 @@ python interactive_chat.py
 ## Environmental Impact
 
 ### Carbon Footprint
- - **Training Hardware**: Single RTX 5070 (200W)
+ - **Training Hardware**: Single RTX 5070 Laptop GPU (100W)
 - **Training Time**: 4 hours
 - **Estimated CO₂**: ~0.3 kg CO₂ equivalent
 - **Efficiency**: 253M parameters per 0.3 kg CO₂
@@ -216,7 +216,7 @@ python interactive_chat.py
 
 ## Citation
 
- ---
+ ```
 @misc{mapneo_mini_2025,
   title={MAP-NEO Mini: An Efficient 253M Parameter Language Model},
   author={[Antony Austin]},
@@ -224,12 +224,12 @@ python interactive_chat.py
   howpublished={\url{https://huggingface.co/[Austin207]/map-neo-mini}},
   note={Trained on NVIDIA RTX 5070 with RefinedWeb data}
 }
- ---
+ ```
 
 ## Technical Details
 
 ### Files Structure
- ---
+ ```
 map-neo-mini/
 ├── config.json        # Model configuration
 ├── pytorch_model.bin  # Model weights
@@ -239,7 +239,7 @@ map-neo-mini/
 ├── vocab.json         # Vocabulary
 ├── merges.txt         # BPE merges
 └── model_neo.py       # Model architecture code
- ---
+ ```
 
 ### Hardware Requirements
 - **Minimum**: 4GB VRAM for inference
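The README's Generation Parameters guidance (0.7-0.9 for creative tasks, 0.3-0.5 for factual) follows from how temperature rescales logits before sampling. A minimal, self-contained sketch of that effect; the toy logits below are invented for illustration and are not part of the repository:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then apply softmax.

    Lower temperature sharpens the distribution (more deterministic,
    suited to factual output); higher temperature flattens it
    (more diverse, suited to creative output).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-token scores (hypothetical, for illustration only)
logits = [2.0, 1.0, 0.5]

factual = softmax_with_temperature(logits, 0.3)   # sharp: top token dominates
creative = softmax_with_temperature(logits, 0.9)  # flatter: more spread out

print([round(p, 3) for p in factual])
print([round(p, 3) for p in creative])
```

At temperature 0.3 the top token takes almost all of the probability mass, while at 0.9 the tail tokens keep a meaningful share, which is why the lower range is recommended when factual consistency matters.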