SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation Paper • 2510.06303 • Published Oct 7 • 15
SDAR Collection The models without suffixes use the default block size = 4. • 21 items • Updated Sep 9 • 6
Unlocking Continual Learning Abilities in Language Models Paper • 2406.17245 • Published Jun 25, 2024 • 30