tinychat
An autoregressive Transformer language model, trained for ~50 hours on a single RTX-3090, processing ~5B tokens.
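Since tinychat is a standard decoder-only autoregressive model, generation is a loop that repeatedly feeds the prefix through the network and appends the next token. A minimal sketch of that loop, with a hypothetical toy `logits_fn` standing in for the real network (the function name and the bigram stand-in are illustrative assumptions, not the actual tinychat code):

```python
import numpy as np

VOCAB_SIZE = 16
rng = np.random.default_rng(0)
W = rng.normal(size=(VOCAB_SIZE, VOCAB_SIZE))  # toy bigram weight table

def logits_fn(prefix):
    # Toy stand-in for the Transformer: next-token logits depend
    # only on the last token (a bigram model). A real checkpoint
    # would run the full network over the whole prefix.
    return W[prefix[-1]]

def generate(prompt, max_new_tokens=8):
    # Greedy autoregressive decoding: extend the sequence one token
    # at a time, always picking the highest-scoring next token.
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = logits_fn(tokens)
        tokens.append(int(np.argmax(logits)))
    return tokens

out = generate([3])
print(out)
```

Swapping `logits_fn` for a real forward pass (and `argmax` for temperature sampling) turns this sketch into the usual chat-model decoding loop.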
| Benchmark | tinychat | gpt2 | gpt2-medium |
|---|---|---|---|
| swag | 47.3% | 48.9% | 56.3% |
| hellaswag | 30.7% | 29.0% | 37.0% |
| openbookqa | 32.8% | 28.0% | 30.6% |