Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment Paper • 2403.18811 • Published Mar 27, 2024
X-Actor: Emotional and Expressive Long-Range Portrait Acting from Audio Paper • 2508.02944 • Published Aug 4
X-UniMotion: Animating Human Images with Expressive, Unified and Identity-Agnostic Motion Latents Paper • 2508.09383 • Published Aug 12 • 1
X-Streamer: Unified Human World Modeling with Audiovisual Interaction Paper • 2509.21574 • Published Sep 25 • 7
Lynx: Towards High-Fidelity Personalized Video Generation Paper • 2509.15496 • Published Sep 19 • 12 • 4
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper • 2408.12528 • Published Aug 22, 2024 • 51
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models Paper • 2404.03118 • Published Apr 3, 2024 • 26
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper • 2404.02905 • Published Apr 3, 2024 • 74
CameraCtrl: Enabling Camera Control for Text-to-Video Generation Paper • 2404.02101 • Published Apr 2, 2024 • 24
LITA: Language Instructed Temporal-Localization Assistant Paper • 2403.19046 • Published Mar 27, 2024 • 19