MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16, 2025 • 26
Large Spatial Model: End-to-end Unposed Images to Semantic 3D Paper • 2410.18956 • Published Oct 24, 2024 • 1
AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving Paper • 2412.15206 • Published Dec 19, 2024
NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results Paper • 2505.03007 • Published May 5, 2025
The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report Paper • 2504.10686 • Published Apr 14, 2025
VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction Paper • 2505.20279 • Published May 26, 2025 • 5
Generative AI for Autonomous Driving: Frontiers and Opportunities Paper • 2505.08854 • Published May 13, 2025 • 1