Main revision

#5
by aldakata - opened

Hi!
I was playing around with the revisions, and I get different results from the main revision than from the stage2-ingredient3-step23852-tokens51B revision.
Shouldn't these be the exact same model according to https://github.com/allenai/OLMo?

The README there says: "For the 1B model, we have trained three times with different data order on 50B high quality tokens, used last checkpoint of seed 42 as final checkpoint."
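One way to check whether two revisions really hold the same weights is to hash the weight file from each and compare the digests. The snippet below is only a rough sketch under a few assumptions: the repo ID is a placeholder (it is not given in this thread), the weights are assumed to live in a single file named model.safetensors, and fetch_weight_file is a hypothetical helper built on huggingface_hub's hf_hub_download. Matching hashes mean the files are byte-identical; differing hashes do not by themselves prove the weights differ, since file metadata could also differ.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large weight files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True if both files are byte-identical (judged by SHA-256)."""
    return file_sha256(path_a) == file_sha256(path_b)


def fetch_weight_file(repo_id, revision, filename="model.safetensors"):
    """Hypothetical helper: download one file from a given revision.

    The default filename is an assumption; sharded checkpoints use
    several files, and each shard would need to be compared.
    """
    # Imported lazily so the hashing helpers work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)


# Example usage (needs network access and a real repo ID):
# REPO_ID = "<org>/<model>"  # placeholder, not taken from this thread
# main_path = fetch_weight_file(REPO_ID, "main")
# ckpt_path = fetch_weight_file(REPO_ID, "stage2-ingredient3-step23852-tokens51B")
# print(same_file_contents(main_path, ckpt_path))
```

If the hashes differ, comparing the tensors themselves would be the next step, since that separates real weight differences from differences in file metadata.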
