RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models Paper • 2510.25257 • Published 8 days ago • 1
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation Paper • 2303.13399 • Published Mar 23, 2023
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer Paper • 2407.17140 • Published Jul 24, 2024 • 2
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 90
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16, 2024 • 31