Pretrained models from the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"
Zayd Muhammad Kawakibi Zuhri
zaydzuhri
AI & ML interests
I really like watching loss go down
Recent Activity
Updated a model about 17 hours ago: zaydzuhri/top-340M-window1024-4096-batch128-steps100000-20251118-062653
Updated a dataset 1 day ago: zaydzuhri/hendrycks_math_text
Published a dataset 1 day ago: zaydzuhri/hendrycks_math_text
Organizations
None yet