allegrolab/hubble-1b-100b_toks-double_depth-perturbed-hf
Text Generation • 1B
Two models, one with a shallower and one with a deeper transformer architecture, trained to assess how model depth affects memorization.
Note: Use revision 'perturbed-100b' for the perturbed models and 'standard' otherwise.
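A minimal sketch of loading one of these revisions with the Hugging Face transformers library, assuming the repo exposes the two revision names from the note above; the prompt and generation settings are purely illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id from this listing; revision names follow the note above.
model_id = "allegrolab/hubble-1b-100b_toks-double_depth-perturbed-hf"
revision = "perturbed-100b"  # or "standard" for the unperturbed checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(model_id, revision=revision)

# Quick text-generation sanity check.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```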