Vision - an AdrienRR Collection

Vision

updated Oct 3

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Paper • 2404.06512 • Published Apr 9, 2024 • 30
Adapting LLaMA Decoder to Vision Transformer

Paper • 2404.06773 • Published Apr 10, 2024 • 18
Quantized Visual Geometry Grounded Transformer

Paper • 2509.21302 • Published Sep 25 • 8
Hyperspherical Latents Improve Continuous-Token Autoregressive Generation

Paper • 2509.24335 • Published Sep 29 • 8
VGGT-X: When VGGT Meets Dense Novel View Synthesis

Paper • 2509.25191 • Published Sep 29 • 18
LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26 • 182
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

Paper • 2510.02283 • Published Oct 2 • 93