Main revision

by aldakata - opened Oct 7

Hi!
I've been playing around with the revisions, and I get different results from the main revision than from the stage2-ingredient3-step23852-tokens51B revision.
Shouldn't these be the exact same model? That is how I read https://github.com/allenai/OLMo, which says:

"For the 1B model, we have trained three times with different data order on 50B high quality tokens, used last checkpoint of seed 42 as final checkpoint."
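One way to check whether two revisions really hold the same weights, independent of any sampling or evaluation noise, is to compare the checkpoint files byte for byte. Below is a minimal sketch of that idea. It assumes the weights are stored as top-level *.safetensors files and that huggingface_hub is installed; the helper names (file_sha256, differing_files, compare_revisions) are made up for illustration, and repo_id is a parameter you would fill in with the repository this discussion belongs to.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_files(dir_a, dir_b, pattern="*.safetensors"):
    """List file names matching `pattern` whose bytes differ between two directories.

    A file that exists in only one of the directories also counts as differing.
    """
    files_a = {p.name: p for p in Path(dir_a).glob(pattern)}
    files_b = {p.name: p for p in Path(dir_b).glob(pattern)}
    differing = []
    for name in sorted(files_a.keys() | files_b.keys()):
        if name not in files_a or name not in files_b:
            differing.append(name)
        elif file_sha256(files_a[name]) != file_sha256(files_b[name]):
            differing.append(name)
    return differing


def compare_revisions(repo_id, revision_a, revision_b):
    """Download the weight files of two revisions and compare them byte for byte."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # Assumption: weights are stored as top-level *.safetensors files.
    patterns = ["*.safetensors"]
    dir_a = snapshot_download(repo_id, revision=revision_a, allow_patterns=patterns)
    dir_b = snapshot_download(repo_id, revision=revision_b, allow_patterns=patterns)
    return differing_files(dir_a, dir_b)
```

Calling compare_revisions(repo_id, "main", "stage2-ingredient3-step23852-tokens51B") returns an empty list if every weight file is byte-identical. A non-empty list means the files differ, although that alone does not prove the tensors differ, since file metadata could in principle vary. If the list is empty and the outputs still disagree, the difference more likely comes from the generation or evaluation setup than from the checkpoints.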
