---
tags:
- babylm
- language-model
- gpt-bert
- multilingual
license: mit
---
# babybabellm-multi-all

This repository contains checkpoints for the **multilingual (all)** variant of **BabyBabeLLM**.

## Files
- `*_15_16.bin` – main model weights  
- `*_15_16_ema.bin` – EMA-smoothed weights  
- `*_15_16_state_dict.bin` – PyTorch state dict  
- `pytorch_model.bin` – extracted EMA weights (for `AutoModel`)  
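
To work with the checkpoint files directly (for example, to compare the EMA and non-EMA weights), you can fetch an individual file and load it with `torch.load`. A minimal sketch, assuming the exact filenames match the patterns above (the `*` prefix is checkpoint-specific, so check the repository's file listing):

```python
import torch
from huggingface_hub import hf_hub_download

repo = "suchirsalhan/babybabellm-multi-all"

# pytorch_model.bin is the extracted EMA weights listed above;
# the other filenames depend on the checkpoint prefix.
path = hf_hub_download(repo_id=repo, filename="pytorch_model.bin")

# Load the raw state dict on CPU and inspect parameter names/shapes.
state_dict = torch.load(path, map_location="cpu")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```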

## Usage
```python
from transformers import AutoModel, AutoTokenizer

repo = "suchirsalhan/babybabellm-multi-all"

# Load the tokenizer and model from the Hub.
# If the checkpoint uses a custom GPT-BERT architecture, you may need to
# pass trust_remote_code=True to both from_pretrained calls.
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

# Tokenize a sample sentence and run a forward pass.
inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
```
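
The `outputs` object follows the usual transformers convention; for instance, a mean-pooled sentence embedding can be derived from the hidden states. A sketch, assuming the model exposes `last_hidden_state` like other encoder-style models:

```python
import torch

# Mean-pool the token embeddings, ignoring padding positions.
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state          # (batch, seq_len, dim)
mask = inputs["attention_mask"].unsqueeze(-1).float()   # (batch, seq_len, 1)
sentence_embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)                         # (batch, dim)
```
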
## Notes
- These are research checkpoints trained on BabyLM-style data.
- Model naming: the `multiall` suffix identifies the language/config variant (here, the multilingual model trained on all languages).