-
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
Paper • 2508.07901 • Published • 39 -
CharacterShot: Controllable and Consistent 4D Character Animation
Paper • 2508.07409 • Published • 39 -
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper • 2508.17437 • Published • 37
Collections
Discover the best community collections!
Collections including paper arxiv:2508.17437
-
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Paper • 2508.14879 • Published • 67 -
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper • 2508.19247 • Published • 41 -
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper • 2508.17437 • Published • 37 -
Multi-View 3D Point Tracking
Paper • 2508.21060 • Published • 22
-
SynCity: Training-Free Generation of 3D Worlds
Paper • 2503.16420 • Published • 27 -
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
Paper • 2501.08983 • Published • 22 -
WorldGrow: Generating Infinite 3D World
Paper • 2510.21682 • Published • 41 -
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion
Paper • 2407.13759 • Published • 18
-
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
Paper • 2508.07981 • Published • 58 -
CharacterShot: Controllable and Consistent 4D Character Animation
Paper • 2508.07409 • Published • 39 -
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
Paper • 2508.10881 • Published • 52 -
Puppeteer: Rig and Animate Your 3D Models
Paper • 2508.10898 • Published • 31
-
Reinforcement Learning in Vision: A Survey
Paper • 2508.08189 • Published • 29 -
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper • 2508.17437 • Published • 37 -
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
Paper • 2509.00428 • Published • 17 -
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
Paper • 2509.06951 • Published • 31
-
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Paper • 2503.10437 • Published • 32 -
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
Paper • 2503.09642 • Published • 19 -
VGGT: Visual Geometry Grounded Transformer
Paper • 2503.11651 • Published • 33 -
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Paper • 2503.16422 • Published • 14
-
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
Paper • 2508.07901 • Published • 39 -
CharacterShot: Controllable and Consistent 4D Character Animation
Paper • 2508.07409 • Published • 39 -
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper • 2508.17437 • Published • 37
-
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Paper • 2508.14879 • Published • 67 -
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper • 2508.19247 • Published • 41 -
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper • 2508.17437 • Published • 37 -
Multi-View 3D Point Tracking
Paper • 2508.21060 • Published • 22
-
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
Paper • 2508.07981 • Published • 58 -
CharacterShot: Controllable and Consistent 4D Character Animation
Paper • 2508.07409 • Published • 39 -
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
Paper • 2508.10881 • Published • 52 -
Puppeteer: Rig and Animate Your 3D Models
Paper • 2508.10898 • Published • 31
-
Reinforcement Learning in Vision: A Survey
Paper • 2508.08189 • Published • 29 -
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper • 2508.17437 • Published • 37 -
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
Paper • 2509.00428 • Published • 17 -
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
Paper • 2509.06951 • Published • 31
-
SynCity: Training-Free Generation of 3D Worlds
Paper • 2503.16420 • Published • 27 -
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
Paper • 2501.08983 • Published • 22 -
WorldGrow: Generating Infinite 3D World
Paper • 2510.21682 • Published • 41 -
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion
Paper • 2407.13759 • Published • 18
-
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Paper • 2503.10437 • Published • 32 -
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
Paper • 2503.09642 • Published • 19 -
VGGT: Visual Geometry Grounded Transformer
Paper • 2503.11651 • Published • 33 -
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Paper • 2503.16422 • Published • 14