---
license: mit
library_name: transformers
pipeline_tag: text-generation
language:
- en
tags:
- gpt2
- historical
- london
- slm
- small-language-model
- text-generation
- history
- english
- safetensors
---

# London Historical LLM – Small Language Model (SLM)

A compact GPT-2 Small model (~117M params) **trained from scratch** on historical London texts (1500–1850). It runs fast on CPU and also supports NVIDIA (CUDA) and AMD (ROCm) GPUs.

> **Note**: This model was **trained from scratch** - not fine-tuned from existing models.

> This page includes simple **virtual-env setup**, **install choices for CPU/CUDA/ROCm**, and an **auto-device inference** example so anyone can get going quickly.

---

## πŸ”Ž Model Description

This is a **Small Language Model (SLM)** version of the London Historical LLM: a GPT-2 Small architecture **trained from scratch** on historical London texts with a custom historical tokenizer, rather than fine-tuned from an existing model.

### Key Features
- ~117M parameters (vs ~354M in the full model; see the quick check below)  
- Custom historical tokenizer (β‰ˆ30k vocab)  
- London-specific context awareness and historical language patterns (e.g., *thou, thee, hath*)  
- Lower memory footprint and faster inference on commodity hardware  
- **Trained from scratch** - not fine-tuned from existing models  
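
To double-check the size claim, here is a minimal parameter-count sketch (assumes `transformers` plus a PyTorch install from the steps below; the exact total may differ slightly from the rounded figure):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bahree/london-historical-slm")
# Total parameter count; should land in the ~117M range quoted above.
print(f"{model.num_parameters():,} parameters")
```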

---

## πŸ§ͺ Intended Use & Limitations

**Use cases:** historical-style narrative generation, prompt-based exploration of London themes (1500–1850), creative writing aids.  
**Limitations:** may produce anachronisms or historically inaccurate statements, and a model of this size reasons less reliably than larger LLMs. Validate outputs before downstream use.

---

## 🐍 Set up a virtual environment (Linux/macOS/Windows)

> Virtual environments isolate project dependencies. Official Python docs: `venv`.

**Check Python & pip**
```bash
# Linux/macOS
python3 --version && python3 -m pip --version
```

```powershell
# Windows (PowerShell)
python --version; python -m pip --version
```

**Create the env**

```bash
# Linux/macOS
python3 -m venv helloLondon
```

```powershell
# Windows (PowerShell)
python -m venv helloLondon
```

```cmd
:: Windows (Command Prompt)
python -m venv helloLondon
```

> **Note**: You can name your virtual environment anything you like, e.g., `.venv`, `my_env`, `london_env`.

**Activate**

```bash
# Linux/macOS
source helloLondon/bin/activate
```

```powershell
# Windows (PowerShell)
.\helloLondon\Scripts\Activate.ps1
```

```cmd
:: Windows (CMD)
.\helloLondon\Scripts\activate.bat
```

> If PowerShell blocks activation (*"running scripts is disabled"*), set the policy then retry activation:

```powershell
Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned
# or just for this session:
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```

---

## πŸ“¦ Install libraries

Upgrade basics, then install Hugging Face libs:

```bash
python -m pip install -U pip setuptools wheel
python -m pip install "transformers" "accelerate" "safetensors"
```
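
A quick sanity check that the libraries import cleanly (prints the installed `transformers` version):

```bash
python -c "import transformers, accelerate, safetensors; print(transformers.__version__)"
```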

---

## Install **one** PyTorch variant (CPU / NVIDIA / AMD)

Use **one** of the commands below. For the most accurate command per OS/accelerator and version, prefer PyTorch's **Get Started** selector.

### A) CPU-only (Linux/Windows/macOS)

```bash
pip install torch --index-url https://download.pytorch.org/whl/cpu
```

### B) NVIDIA GPU (CUDA)

Pick the CUDA series that matches your system (examples below):

```bash
# CUDA 12.6
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

# CUDA 12.4
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

### C) AMD GPU (ROCm, **Linux-only**)

Install the ROCm build matching your ROCm runtime (examples):

```bash
# ROCm 6.3
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.3

# ROCm 6.2 (incl. 6.2.x)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2.4

# ROCm 6.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1
```

**Quick sanity check**

```bash
python - <<'PY'
import torch
print("torch:", torch.__version__)
print("GPU available:", torch.cuda.is_available())  # True on both CUDA and ROCm builds
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
print("hip:", getattr(torch.version, "hip", None))  # non-None means a ROCm build
PY
```

---

## πŸš€ Inference (auto-detect device)

This snippet picks the best device (CUDA/ROCm if available, else CPU) and uses sensible generation defaults for this SLM.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "bahree/london-historical-slm"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"  # ROCm builds also report as "cuda"
model = model.to(device)

prompt = "In the year 1834, I walked through the streets of London and witnessed"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = model.generate(
    **inputs,  # passes input_ids plus attention_mask
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.2,
    no_repeat_ngram_size=3,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## πŸ§ͺ **Testing Your Model**

### **Quick Testing (10 Automated Prompts)**
```bash
# Test with 10 automated historical prompts
python 06_inference/test_published_models.py --model_type slm
```

**Expected Output:**
```
πŸ§ͺ Testing SLM Model: bahree/london-historical-slm
============================================================
πŸ“‚ Loading model...
βœ… Model loaded in 8.91 seconds
πŸ“Š Model Info:
   Type: SLM
   Description: Small Language Model (117M parameters)
   Device: cuda
   Vocabulary size: 30,000
   Max length: 512

🎯 Testing generation with 10 prompts...
[10 automated tests with historical text generation]
```

### **Interactive Testing**
```bash
# Interactive mode for custom prompts
python 06_inference/inference_unified.py --published --model_type slm --interactive

# Single prompt test
python 06_inference/inference_unified.py --published --model_type slm --prompt "In the year 1834, I walked through the streets of London and witnessed"
```

**Need more headroom later?** Load with πŸ€— Accelerate and `device_map="auto"` to spread layers across available devices/CPU automatically.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "bahree/london-historical-slm"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```

---

## πŸͺŸ Windows Terminal one-liners

**PowerShell**

```powershell
python -c "from transformers import AutoTokenizer,AutoModelForCausalLM; m='bahree/london-historical-slm'; t=AutoTokenizer.from_pretrained(m); model=AutoModelForCausalLM.from_pretrained(m); p='In the year 1834, I walked through the streets of London and witnessed'; i=t(p,return_tensors='pt'); print(t.decode(model.generate(i['input_ids'],max_new_tokens=50,do_sample=True)[0],skip_special_tokens=True))"
```

**Command Prompt (CMD)**

```cmd
python -c "from transformers import AutoTokenizer, AutoModelForCausalLM ^&^& import torch ^&^& m='bahree/london-historical-slm' ^&^& t=AutoTokenizer.from_pretrained(m) ^&^& model=AutoModelForCausalLM.from_pretrained(m) ^&^& p='In the year 1834, I walked through the streets of London and witnessed' ^&^& i=t(p, return_tensors='pt') ^&^& print(t.decode(model.generate(i['input_ids'], max_new_tokens=50, do_sample=True)[0], skip_special_tokens=True))"
```

---

## πŸ’‘ Basic Usage (Python)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bahree/london-historical-slm")
model = AutoModelForCausalLM.from_pretrained("bahree/london-historical-slm")

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompt = "In the year 1834, I walked through the streets of London and witnessed"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,  # passes input_ids plus attention_mask
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.2,
    no_repeat_ngram_size=3,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## 🧰 Example Prompts

* **Tudor (1558):** "On this day in 1558, Queen Mary has died and …"
* **Stuart (1666):** "The Great Fire of London has consumed much of the city, and …"
* **Georgian/Victorian:** "As I journeyed through the streets of London, I observed …"
* **London specifics:** "Parliament sat in Westminster Hall …", "The Thames flowed dark and mysterious …"
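
A short sketch that runs a few of these prompts in a loop, reusing the generation settings from the examples above (sampling output will vary run to run):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "bahree/london-historical-slm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompts = [
    "On this day in 1558, Queen Mary has died and",
    "The Great Fire of London has consumed much of the city, and",
    "As I journeyed through the streets of London, I observed",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True), "\n")
```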

---

## πŸ› οΈ Training Details

* **Architecture:** GPT-2 Small (12 layers, hidden size 768; see the config sketch below)
* **Params:** ~117M
* **Tokenizer:** custom historical tokenizer (~30k vocab) with London-specific and historical tokens
* **Data:** historical London corpus (1500–1850)
* **Steps/Epochs:** 30,000 steps (extended training for better convergence)
* **Batch/LR:** 32, 3e-4 (optimized for segmented data)
* **Hardware:** 2Γ— GPU training with Distributed Data Parallel
* **Final Training Loss:** 1.395 (43% improvement from 20K steps)
* **Model Flops Utilization:** 3.5% (excellent efficiency)
* **Training Method:** **Trained from scratch** - not fine-tuned
* **Context Length:** 256 tokens (optimized for historical text segments)
* **Status:** βœ… **Successfully published and tested** - ready for production use
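
For reference, a sketch of roughly this configuration using standard `GPT2Config` fields; the values come from the list above (head count assumed to be GPT-2 Small's 12), and the shipped checkpoint's exact config may differ:

```python
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=30_000,  # custom historical tokenizer (~30k vocab)
    n_positions=256,    # context length
    n_embd=768,         # hidden size
    n_layer=12,         # transformer layers
    n_head=12,          # assumed: standard GPT-2 Small head count
)
model = GPT2LMHeadModel(config)  # randomly initialized, for shape inspection only
print(f"{model.num_parameters():,} parameters")
```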

---

## πŸ”€ Historical Tokenizer

* Compact 30k vocab targeting 1500–1850 English
* Tokens for **year/date/name/place/title**, plus **thames**, **westminster**, etc.; includes **thou/thee/hath/doth** style markers (see the snippet below)
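
A quick way to inspect the tokenizer (a sketch; the exact special tokens are whatever the published checkpoint ships):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bahree/london-historical-slm")
print("vocab size:", tokenizer.vocab_size)
print("special tokens:", tokenizer.all_special_tokens)

# Archaic forms should tokenize compactly if the vocab targets 1500-1850 English.
print(tokenizer.tokenize("Thou hath seen the Thames at Westminster"))
```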

---

## ⚠️ Troubleshooting

* **`ImportError: AutoModelForCausalLM requires the PyTorch library`**
  β†’ Install PyTorch with the correct accelerator variant (see CPU/CUDA/ROCm above or use the official selector).

* **AMD GPU not used**
  β†’ Ensure you installed a ROCm build and you're on Linux (`pip install ... --index-url https://download.pytorch.org/whl/rocmX.Y`). Verify with `torch.cuda.is_available()` and check the device name. ROCm wheels are Linux-only.

* **Running out of VRAM**
  → Try smaller batch/sequence lengths, or load with `device_map="auto"` via 🤗 Accelerate to offload layers to CPU/disk, as sketched below.

---

## πŸ“š Citation

If you use this model, please cite:

```bibtex
@misc{london-historical-slm,
  title   = {London Historical LLM - Small Language Model: A Compact GPT-2 for Historical Text Generation},
  author  = {Amit Bahree},
  year    = {2025},
  url     = {https://huggingface.co/bahree/london-historical-slm}
}
```

---

## Repository

The complete source code, training scripts, and documentation for this model are available on GitHub:

**πŸ”— [https://github.com/bahree/helloLondon](https://github.com/bahree/helloLondon)**

This repository includes:
- Complete data collection pipeline for 1500–1850 historical English
- Custom tokenizer optimized for historical text  
- Training infrastructure with GPU optimization
- Evaluation and deployment tools
- Comprehensive documentation and examples

### Quick Start with Repository
```bash
git clone https://github.com/bahree/helloLondon.git
cd helloLondon
python 06_inference/test_published_models.py --model_type slm
```

---

## 🧾 License

MIT (see [LICENSE](https://github.com/bahree/helloLondon/blob/main/LICENSE) in repo).