Efficient Generative Model Training via Embedded Representation Warmup • Paper 2504.10188 • Published Apr 14
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect • Paper 2403.03853 • Published Mar 6, 2024