FlameF0X/i3-tiny
Text Generation
Note: The models are listed in the default order set by Hugging Face, so the latest model appears at the bottom.
Note The first i3-architecture LM.
Note Our first usable i3 model (meaning we added Transformers support and supporting code for it).
Note Smol stable text generator that took over 14 hours to pre-train :)
--- Changes ---
Trained on over 1T tokens
LoRPt layers
Note SOTA model. Pre-trained in around 2 to 4 hours, compared with over 14 hours for the previous version.
--- Changes ---
Trained on over 3T tokens
Other details are available in the model card.
Try i3-80m, a SOTA efficient-training LM architecture.
Note Our first Space for i3-80m.