Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.17437

Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation

Paper • 2508.07901 • Published Aug 11 • 39
CharacterShot: Controllable and Consistent 4D Character Animation

Paper • 2508.07409 • Published Aug 10 • 39
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

Paper • 2508.14879 • Published Aug 20 • 67
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space

Paper • 2508.19247 • Published Aug 26 • 41
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37
Multi-View 3D Point Tracking

Paper • 2508.21060 • Published Aug 28 • 22

Arrexel/pattern-diffusion

Text-to-Image • Updated Aug 8 • 129 • 106
stepfun-ai/NextStep-1-Large

Text-to-Image • 15B • Updated Aug 19 • 77 • 95
facebook/dinov3-vit7b16-pretrain-lvd1689m

Image Feature Extraction • 7B • Updated Aug 19 • 21.9k • 186
Skywork/Matrix-Game-2.0

Image-to-Video • Updated Aug 21 • 272

(Urban) World Model

SynCity: Training-Free Generation of 3D Worlds

Paper • 2503.16420 • Published Mar 20 • 27
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities

Paper • 2501.08983 • Published Jan 15 • 22
WorldGrow: Generating Infinite 3D World

Paper • 2510.21682 • Published 26 days ago • 41
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion

Paper • 2407.13759 • Published Jul 18, 2024 • 18

Pixie Pixel to physics

Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Paper • 2508.07981 • Published Aug 11 • 58
CharacterShot: Controllable and Consistent 4D Character Animation

Paper • 2508.07409 • Published Aug 10 • 39
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

Paper • 2508.10881 • Published Aug 14 • 52
Puppeteer: Rig and Animate Your 3D Models

Paper • 2508.10898 • Published Aug 14 • 31

Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

Paper • 2509.00428 • Published Aug 30 • 17
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published Sep 8 • 31

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 32
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12 • 19
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14 • 33
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20 • 14

Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation

Paper • 2508.07901 • Published Aug 11 • 39
CharacterShot: Controllable and Consistent 4D Character Animation

Paper • 2508.07409 • Published Aug 10 • 39
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

Pixie Pixel to physics

Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

Paper • 2508.14879 • Published Aug 20 • 67
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space

Paper • 2508.19247 • Published Aug 26 • 41
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37
Multi-View 3D Point Tracking

Paper • 2508.21060 • Published Aug 28 • 22

Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Paper • 2508.07981 • Published Aug 11 • 58
CharacterShot: Controllable and Consistent 4D Character Animation

Paper • 2508.07409 • Published Aug 10 • 39
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

Paper • 2508.10881 • Published Aug 14 • 52
Puppeteer: Rig and Animate Your 3D Models

Paper • 2508.10898 • Published Aug 14 • 31

Arrexel/pattern-diffusion

Text-to-Image • Updated Aug 8 • 129 • 106
stepfun-ai/NextStep-1-Large

Text-to-Image • 15B • Updated Aug 19 • 77 • 95
facebook/dinov3-vit7b16-pretrain-lvd1689m

Image Feature Extraction • 7B • Updated Aug 19 • 21.9k • 186
Skywork/Matrix-Game-2.0

Image-to-Video • Updated Aug 21 • 272

Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

Paper • 2509.00428 • Published Aug 30 • 17
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published Sep 8 • 31

(Urban) World Model

SynCity: Training-Free Generation of 3D Worlds

Paper • 2503.16420 • Published Mar 20 • 27
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities

Paper • 2501.08983 • Published Jan 15 • 22
WorldGrow: Generating Infinite 3D World

Paper • 2510.21682 • Published 26 days ago • 41
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion

Paper • 2407.13759 • Published Jul 18, 2024 • 18

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 32
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12 • 19
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14 • 33
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20 • 14

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs