OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 22 days ago • 86
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints Paper • 2510.08565 • Published 30 days ago • 19
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published 30 days ago • 108
Factuality Matters: When Image Generation and Editing Meet Structured Visuals Paper • 2510.05091 • Published Oct 6 • 18
OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview Image-Text-to-Text • 0.4B • Updated Aug 29 • 41.3k • 76
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 202