π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models Paper • 2510.25889 • Published 6 days ago • 50
Exploring Conditions for Diffusion models in Robotic Control Paper • 2510.15510 • Published 18 days ago • 39
ReCode: Unify Plan and Action for Universal Granularity Control Paper • 2510.23564 • Published 8 days ago • 117
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published 8 days ago • 172
GigaBrain-0: A World Model-Powered Vision-Language-Action Model Paper • 2510.19430 • Published 13 days ago • 44
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 22 days ago • 160
Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model Paper • 2510.12276 • Published 21 days ago • 142
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation Paper • 2510.09116 • Published 25 days ago • 95
Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training Paper • 2510.12586 • Published 21 days ago • 107
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published 22 days ago • 97
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 22 days ago • 172
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published 26 days ago • 121
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published 28 days ago • 135
How I Trained Action Chunking Transformer (ACT) on SO-101: My Journey, Gotchas, and Lessons Article • By sherryxychen • Sep 30 • 37
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning Paper • 2510.08555 • Published 26 days ago • 62
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published 26 days ago • 108
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 136
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29 • 138
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28 • 170