ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation Paper • 2510.08551 • Published Oct 9 • 31
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 137
BRIDGE - Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation Paper • 2509.25077 • Published Sep 29 • 14
MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning Paper • 2509.22281 • Published Sep 26 • 31
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning Paper • 2509.22647 • Published Sep 26 • 32
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published Sep 15 • 103
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control Paper • 2508.21112 • Published Aug 28 • 75
ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting Paper • 2507.15454 • Published Jul 21 • 7
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data Paper • 2507.07095 • Published Jul 9 • 54
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling Paper • 2507.05240 • Published Jul 7 • 47
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Paper • 2506.18882 • Published Jun 23 • 89
AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views Paper • 2505.23716 • Published May 29 • 31
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation Paper • 2505.21491 • Published May 27 • 17
SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations Paper • 2505.02094 • Published May 4 • 19