From Masks to Worlds: A Hitchhiker's Guide to World Models Paper • 2510.20668 • Published 15 days ago • 6
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent Paper • 2506.17612 • Published Jun 21 • 64
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published about 1 month ago • 52
HiFiHR: Enhancing 3D Hand Reconstruction from a Single Image via High-Fidelity Texture Paper • 2308.13628 • Published Aug 25, 2023
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model Paper • 2505.23606 • Published May 29 • 14
Personalized Safety Alignment for Text-to-Image Diffusion Models Paper • 2508.01151 • Published Aug 2 • 8
R-KV: Redundancy-aware KV Cache Compression for Reasoning Models Paper • 2505.24133 • Published May 30 • 1
Efficient Deweather Mixture-of-Experts with Uncertainty-aware Feature-wise Linear Modulation Paper • 2312.16610 • Published Dec 27, 2023
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering Paper • 2507.11527 • Published Jul 15 • 32
Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models Paper • 2505.24164 • Published May 30
UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions Paper • 2506.13691 • Published Jun 16 • 2
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology Paper • 2507.07999 • Published Jul 10 • 49