Collections

Collections including paper arxiv:2410.13925

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model

Paper • 2410.13925 • Published Oct 17, 2024 • 24

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 77

InfImagine/FiT

Updated Oct 31, 2024 • 2
InfImagine/FiTv2

Updated Oct 30, 2024 • 4
InfImagine/imagenet_features_1024_sd_vae_ft_ema

Viewer • Updated Nov 6, 2024 • 1.44M • 92 • 2
InfImagine/imagenet1k_features_256_sd_vae_ft_ema

Viewer • Updated Nov 6, 2024 • 3.09M • 43 • 2

image synthetic

FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model

Paper • 2410.13925 • Published Oct 17, 2024 • 24
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities

Paper • 2410.14672 • Published Oct 18, 2024 • 8
Scalable Ranked Preference Optimization for Text-to-Image Generation

Paper • 2410.18013 • Published Oct 23, 2024 • 15
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

Paper • 2410.18666 • Published Oct 24, 2024 • 19
