VideoFrom3D: 3D Scene Video Generation via Complementary Image and Video Diffusion Models Paper • 2509.17985 • Published Sep 22 • 25
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation Paper • 2509.15185 • Published Sep 18 • 29
WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion Model via Training-Free Guidance Paper • 2509.15130 • Published Sep 18 • 30
Measuring Epistemic Humility in Multimodal Large Language Models Paper • 2509.09658 • Published Sep 11 • 6
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence Paper • 2509.12203 • Published Sep 15 • 19
InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts Paper • 2509.10813 • Published Sep 13 • 30
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published Sep 15 • 103
CognitiveSky: Scalable Sentiment and Narrative Analysis for Decentralized Social Media Paper • 2509.11444 • Published Sep 14 • 3
X-Part: high fidelity and structure coherent shape decomposition Paper • 2509.08643 • Published Sep 10 • 26