allegrolab's Collections

Hubble - Architecture

updated 27 days ago

Two models trained with shallower and deeper transformer architectures to assess how model depth affects memorization.

  • allegrolab/hubble-1b-100b_toks-double_depth-perturbed-hf

    Text Generation • 1B • Updated 20 days ago • 8

  • allegrolab/hubble-1b-100b_toks-double_depth-standard-hf

    Text Generation • 1B • Updated 20 days ago • 5

  • allegrolab/hubble-1b-100b_toks-half_depth-perturbed-hf

    Text Generation • 1B • Updated 20 days ago • 5

  • allegrolab/hubble-1b-100b_toks-half_depth-standard-hf

    Text Generation • 1B • Updated 20 days ago • 5

  • allegrolab/hubble-1b-100b_toks-double_depth-perturbed-neox

    Text Generation • Updated 20 days ago

  • allegrolab/hubble-1b-100b_toks-double_depth-standard-neox

    Text Generation • Updated 20 days ago

  • allegrolab/hubble-1b-100b_toks-half_depth-perturbed-neox

    Text Generation • Updated 20 days ago

  • allegrolab/hubble-1b-100b_toks-half_depth-standard-neox

    Text Generation • Updated 20 days ago

  • allegrolab/dclm-baseline-500b_toks

    Updated 20 days ago • 161

    Note: Use revision 'perturbed-100b' for the perturbed models and 'standard' otherwise.
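
The note above can be turned into a small lookup. The following is a minimal sketch, not an official loader: the helper names `corpus_revision` and `list_corpus_files` are hypothetical, and the sketch assumes that the `-perturbed-` / `-standard-` part of each model name in this collection identifies which revision of `allegrolab/dclm-baseline-500b_toks` applies. The file-listing helper relies on `list_repo_files` from the `huggingface_hub` package and needs network access, so it is defined but not called here.

```python
DATASET_REPO = "allegrolab/dclm-baseline-500b_toks"


def corpus_revision(model_id: str) -> str:
    """Pick the dataset revision for a model in this collection.

    Assumption: the variant is encoded in the model name, as in
    'hubble-1b-100b_toks-half_depth-perturbed-hf'.
    """
    name = model_id.split("/")[-1]
    # Per the collection note: 'perturbed-100b' for perturbed models,
    # 'standard' otherwise.
    return "perturbed-100b" if "-perturbed-" in name else "standard"


def list_corpus_files(model_id: str) -> list:
    """List the files in the matching dataset revision (requires network)."""
    # Imported lazily so corpus_revision works without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return list_repo_files(
        DATASET_REPO,
        repo_type="dataset",
        revision=corpus_revision(model_id),
    )
```

For example, `corpus_revision("allegrolab/hubble-1b-100b_toks-double_depth-perturbed-hf")` selects the `perturbed-100b` revision, while any of the `standard` models map to `standard`.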
