GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 237
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents Paper • 2510.24702 • Published 11 days ago • 24
Grounding Multilingual Multimodal LLMs With Cultural Knowledge Paper • 2508.07414 • Published Aug 10 • 1
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers Paper • 2412.00142 • Published Nov 28, 2024 • 4
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Paper • 2410.14669 • Published Oct 18, 2024 • 39
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages Paper • 2410.16153 • Published Oct 21, 2024 • 44