Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models Paper • 2405.16759 • Published May 27, 2024
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting Paper • 2212.06909 • Published Dec 13, 2022
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning Paper • 2302.14115 • Published Feb 27, 2023
Gemini: A Family of Highly Capable Multimodal Models Paper • 2312.11805 • Published Dec 19, 2023
Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation Paper • 2310.18235 • Published Oct 27, 2023
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings Paper • 2404.16820 • Published Apr 25, 2024
DOCCI: Descriptions of Connected and Contrasting Images Paper • 2404.19753 • Published Apr 30, 2024