A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning Paper • 2509.15937 • Published Sep 19 • 20
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds Paper • 2508.14879 • Published Aug 20 • 65
ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting Paper • 2507.15454 • Published Jul 21 • 7
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation Paper • 2308.07498 • Published Aug 14, 2023
Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection Paper • 2502.01401 • Published Feb 3 • 1
NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance Paper • 2505.08712 • Published May 13 • 6
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling Paper • 2507.05240 • Published Jul 7 • 47
GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes Paper • 2505.20294 • Published May 26 • 4
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy Paper • 2503.19757 • Published Mar 25 • 51
Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation Paper • 2503.13424 • Published Mar 17 • 30