Hugging Face
Catherine Arnett
catherinearnett
96 followers · 32 following
https://catherinearnett.github.io/
linguist_cat
catherinearnett
catherinearnett.bsky.social
AI & ML interests
multilingual NLP, tokenization
Recent Activity
upvoted a collection (1 day ago): Zh-Pythia
authored a paper (10 days ago): Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
authored a paper (10 days ago): Explaining and Mitigating Crosslingual Tokenizer Inequities
catherinearnett's activity
liked a dataset (11 days ago)
mrlbenchmarks/global-piqa-nonparallel • Updated 12 days ago • 11.6k • 2.68k • 21
liked a dataset (about 2 months ago)
nlip/DIWALI • Updated Sep 24 • 8.82k • 95 • 5
liked 4 datasets (4 months ago)
classla/ParlaSpeech-PL • Updated Jul 2 • 531k • 242 • 4
classla/ParlaSpeech-HR • Updated Jul 2 • 868k • 264 • 3
classla/ParlaSpeech-CZ • Updated Jul 2 • 711k • 75 • 4
classla/ParlaSpeech-RS • Updated Jul 2 • 278k • 90 • 3
liked a dataset (5 months ago)
UD-Filipino/UD_Tagalog-NewsCrawl • Updated Jul 23 • 15.6k • 72 • 1
liked a dataset (7 months ago)
jumelet/multiblimp • Updated May 16 • 121k • 1.01k • 15
liked a dataset (over 1 year ago)
ambean/lingOly • Updated Jun 11, 2024 • 90 • 161 • 9