UDKAG: Augmenting Large Vision-Language Models with Up-to-Date Knowledge Paper • 2405.14554 • Published May 23, 2024
GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation Paper • 2411.18499 • Published Nov 27, 2024 • 18
Multi-Sourced Compositional Generalization in Visual Question Answering Paper • 2505.23045 • Published May 29, 2025
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models Paper • 2504.05782 • Published Apr 8, 2025 • 3
SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model Paper • 2505.22126 • Published May 28, 2025 • 3
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models Paper • 2408.02718 • Published Aug 5, 2024 • 62
ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy Paper • 2503.06542 • Published Mar 9, 2025 • 7
A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation Paper • 2506.09427 • Published Jun 11, 2025 • 8