Article How to make NeuTTS-air generate over 200 seconds of audio in a single second. Nov 21, 2025 • 22
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models Paper • 2510.17519 • Published Oct 20, 2025 • 9
pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation Paper • 2510.14974 • Published Oct 16, 2025 • 9
Stable Video Infinity: Infinite-Length Video Generation with Error Recycling Paper • 2510.09212 • Published Oct 10, 2025 • 17
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 54
Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation Paper • 2510.01284 • Published Sep 30, 2025 • 34
MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech Paper • 2509.25131 • Published Sep 29, 2025 • 15
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer Paper • 2509.24695 • Published Sep 29, 2025 • 44
SVDQuant Collection Models and datasets for "SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models" • 20 items • Updated May 29, 2025 • 64
LPD Collection Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation • 6 items • Updated Jul 2, 2025 • 2
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs Paper • 2509.08358 • Published Sep 10, 2025 • 13
Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling Paper • 2509.01624 • Published Sep 1, 2025 • 7