ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World Paper • 2505.19095 • Published May 25 • 1
UGround Collection Navigating GUIs as Humans Do: Universal Visual Grounding for GUI Agents (ICLR'25 Oral) • 10 items • Updated May 4 • 7
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15, 2024 • 189