Multimodal - a JM-Brun Collection

Multimodal

updated Jul 25

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Paper • 2505.02567 • Published May 5 • 80

OmniGen2: Exploration to Advanced Multimodal Generation
Paper • 2506.18871 • Published Jun 23 • 77

UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
Paper • 2506.17202 • Published Jun 20 • 10

ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation
Paper • 2506.18095 • Published Jun 22 • 66

Ovis-U1 Technical Report
Paper • 2506.23044 • Published Jun 29 • 62

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
Paper • 2507.01925 • Published Jul 2 • 38

Pixels, Patterns, but No Poetry: To See The World like Humans
Paper • 2507.16863 • Published Jul 21 • 68