---
license: mit
library_name: transformers
pipeline_tag: text-generation
language:
- en
tags:
- gpt2
- historical
- london
- slm
- small-language-model
- text-generation
- history
- english
- safetensors
---

# London Historical LLM – Small Language Model (SLM)

A compact GPT-2 Small model (~117M parameters) **trained from scratch** on historical London texts (1500–1850). It runs quickly on CPU and supports NVIDIA (CUDA) and AMD (ROCm) GPUs.

> **Note**: This model was **trained from scratch**, not fine-tuned from an existing model.

> This page includes simple **virtual-env setup**, **install choices for CPU/CUDA/ROCm**, and an **auto-device inference** example so anyone can get going quickly.

---

## 🔎 Model Description

This is a **Small Language Model (SLM)** version of the London Historical LLM: a GPT-2 Small architecture **trained from scratch** on historical London texts with a custom historical tokenizer, not fine-tuned from an existing model.

### Key Features
- ~117M parameters (vs ~354M in the full model)
- Custom historical tokenizer (≈30k vocab)
- London-specific context awareness and historical language patterns (e.g., *thou, thee, hath*)
- Lower memory footprint and faster inference on commodity hardware
- **Trained from scratch**, not fine-tuned from existing models

---

## 🧪 Intended Use & Limitations

**Use cases:** historical-style narrative generation, prompt-based exploration of London themes (1500–1850), creative writing aids.

**Limitations:** may produce anachronisms or historically inaccurate statements; a smaller model reasons less reliably than larger LLMs. Validate outputs before downstream use.

---

## 🐍 Set up a virtual environment (Linux/macOS/Windows)

> Virtual environments isolate project dependencies. Official Python docs: `venv`.

**Check Python & pip**

```bash
# Linux/macOS
python3 --version && python3 -m pip --version
```

```powershell
# Windows (PowerShell)
python --version; python -m pip --version
```

**Create the env**

```bash
# Linux/macOS
python3 -m venv .venv
```

```powershell
# Windows (PowerShell)
python -m venv .venv
```

```cmd
:: Windows (Command Prompt)
python -m venv .venv
```

**Activate**

```bash
# Linux/macOS
source .venv/bin/activate
```

```powershell
# Windows (PowerShell)
.\.venv\Scripts\Activate.ps1
```

```cmd
:: Windows (CMD)
.\.venv\Scripts\activate.bat
```

> If PowerShell blocks activation (*"running scripts is disabled"*), set the policy then retry activation:

```powershell
Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned
# or just for this session:
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```

---

## 📦 Install libraries

Upgrade basics, then install Hugging Face libs:

```bash
python -m pip install -U pip setuptools wheel
python -m pip install "transformers" "accelerate" "safetensors"
```

---

## ⚙️ Install **one** PyTorch variant (CPU / NVIDIA / AMD)

Use **one** of the commands below. For the exact command for your OS, accelerator, and version, prefer PyTorch's **Get Started** selector.

### A) CPU-only (Linux/Windows/macOS)

```bash
pip install torch --index-url https://download.pytorch.org/whl/cpu
```

### B) NVIDIA GPU (CUDA)

Pick the CUDA series that matches your system (examples below):

```bash
# CUDA 12.6
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

# CUDA 12.4
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

### C) AMD GPU (ROCm, **Linux-only**)

Install the ROCm build matching your ROCm runtime (examples):

```bash
# ROCm 6.3
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.3

# ROCm 6.2 (incl. 6.2.x)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2.4

# ROCm 6.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1
```

**Quick sanity check**

```bash
python - <<'PY'
import torch
print("torch:", torch.__version__)
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
PY
```

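ROCm builds reuse the CUDA API, so `torch.cuda.is_available()` also reports AMD GPUs. To confirm which backend your wheel was built for, check the version metadata (`torch.version.hip` is set on ROCm builds and `None` otherwise):

```bash
python -c "import torch; print('cuda:', torch.version.cuda, '| hip:', torch.version.hip)"
```
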
---

## 🚀 Inference (auto-detect device)

This snippet picks the best device (CUDA/ROCm if available, else CPU) and uses sensible generation defaults for this SLM.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "bahree/london-historical-slm"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

prompt = "In the year 1834, I walked through the streets of London and witnessed"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.2,
    no_repeat_ngram_size=3,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Need more headroom later?** Load with 🤗 Accelerate and `device_map="auto"` to spread layers across available devices/CPU automatically.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "bahree/london-historical-slm"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```

---

## 🪟 Windows Terminal one-liners

**PowerShell**

```powershell
python -c "from transformers import AutoTokenizer,AutoModelForCausalLM; m='bahree/london-historical-slm'; t=AutoTokenizer.from_pretrained(m); model=AutoModelForCausalLM.from_pretrained(m); p='In the year 1834, I walked through the streets of London and witnessed'; i=t(p,return_tensors='pt'); print(t.decode(model.generate(i['input_ids'],max_new_tokens=50,do_sample=True)[0],skip_special_tokens=True))"
```

**Command Prompt (CMD)**

```cmd
python -c "from transformers import AutoTokenizer, AutoModelForCausalLM; m='bahree/london-historical-slm'; t=AutoTokenizer.from_pretrained(m); model=AutoModelForCausalLM.from_pretrained(m); p='In the year 1834, I walked through the streets of London and witnessed'; i=t(p, return_tensors='pt'); print(t.decode(model.generate(i['input_ids'], max_new_tokens=50, do_sample=True)[0], skip_special_tokens=True))"
```

> `python -c` takes a single Python program, so statements are chained with semicolons in both shells.

---

## 💡 Basic Usage (Python)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bahree/london-historical-slm")
model = AutoModelForCausalLM.from_pretrained("bahree/london-historical-slm")

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompt = "In the year 1834, I walked through the streets of London and witnessed"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.2,
    no_repeat_ngram_size=3,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

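**Prefer the high-level API?** The same generation also runs through a 🤗 Transformers `pipeline`; here is a minimal sketch using the sampling settings from above:

```python
from transformers import pipeline

# The pipeline bundles the tokenizer and model; device_map="auto"
# (requires accelerate) places it on a GPU when one is available.
generator = pipeline("text-generation", model="bahree/london-historical-slm", device_map="auto")

result = generator(
    "In the year 1834, I walked through the streets of London and witnessed",
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(result[0]["generated_text"])
```
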
---

## 🧰 Example Prompts

* **Tudor (1558):** "On this day in 1558, Queen Mary has died and …"
* **Stuart (1666):** "The Great Fire of London has consumed much of the city, and …"
* **Georgian/Victorian:** "As I journeyed through the streets of London, I observed …"
* **London specifics:** "Parliament sat in Westminster Hall …", "The Thames flowed dark and mysterious …"

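To batch-run prompts like these, a minimal sketch reusing the generation settings shown earlier:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "bahree/london-historical-slm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

prompts = [
    "On this day in 1558, Queen Mary has died and",
    "The Great Fire of London has consumed much of the city, and",
    "As I journeyed through the streets of London, I observed",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(
        inputs["input_ids"],
        max_new_tokens=50,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    print("-" * 40)
```
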
---

## 🛠️ Training Details

* **Architecture:** GPT-2 Small (12 layers, hidden size 768)
* **Params:** ~117M
* **Tokenizer:** custom historical tokenizer (~30k vocab) with London-specific and historical tokens
* **Data:** historical London corpus (1500–1850)
* **Steps:** 30,000 (extended training for better convergence)
* **Batch/LR:** 32, 3e-4 (optimized for segmented data)
* **Hardware:** 2× GPUs with Distributed Data Parallel
* **Final Training Loss:** 1.395 (a 43% improvement over the 20K-step run)
* **Model FLOPs Utilization:** 3.5%
* **Training Method:** **trained from scratch**, not fine-tuned

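You can verify the parameter count and architecture directly from the released checkpoint:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bahree/london-historical-slm")
print(f"parameters: {model.num_parameters():,}")  # ~117M expected
print("layers:", model.config.n_layer, "| hidden size:", model.config.n_embd)
```
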
---

## 🔀 Historical Tokenizer

* Compact 30k vocab targeting 1500–1850 English
* Tokens for **year/date/name/place/title**, plus **thames**, **westminster**, etc.; includes **thou/thee/hath/doth** style markers

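To see the tokenizer at work on period English (the exact splits depend on the trained vocab):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bahree/london-historical-slm")
print("vocab size:", tok.vocab_size)
print(tok.tokenize("Thou hath seen the Thames flow past Westminster."))
```
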
---

## ⚠️ Troubleshooting

* **`ImportError: AutoModelForCausalLM requires the PyTorch library`**
  → Install PyTorch with the correct accelerator variant (see CPU/CUDA/ROCm above, or use the official selector).

* **AMD GPU not used**
  → Ensure you installed a ROCm build and you're on Linux (`pip install ... --index-url https://download.pytorch.org/whl/rocmX.Y`). Verify with `torch.cuda.is_available()` and check the device name. ROCm wheels are Linux-only.

* **Running out of VRAM**
  → Try smaller batch/sequence lengths, or load with `device_map="auto"` via 🤗 Accelerate to offload layers to CPU/disk.

---

## 📚 Citation

If you use this model, please cite:

```bibtex
@misc{london-historical-slm,
  title  = {London Historical LLM - Small Language Model: A Compact GPT-2 for Historical Text Generation},
  author = {Amit Bahree},
  year   = {2025},
  url    = {https://huggingface.co/bahree/london-historical-slm}
}
```

---

## 🧾 License

MIT (see `LICENSE` in repo).