carlizor's Collections
Image Vision
Updated Aug 12
Salesforce/xgen-mm-phi3-mini-instruct-r-v1 • Image-Text-to-Text • 5B • Updated Feb 3 • 689 • 185
AIDC-AI/Ovis1.6-Gemma2-9B • Image-Text-to-Text • 10B • Updated Aug 15 • 262 • 275
nvidia/NVLM-D-72B • Image-Text-to-Text • 79B • Updated Jan 14 • 52.9k • 774
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 391 • 1.7k
deepseek-ai/Janus-1.3B • Any-to-Any • 2B • Updated Jan 27 • 11k • 592
deepseek-ai/JanusFlow-1.3B • Any-to-Any • 2B • Updated Jan 27 • 582 • 151
NexaAI/OmniVLM-968M • 0.5B • Updated Aug 20 • 2.34k • 527
vikhyatk/moondream2 • Image-Text-to-Text • 2B • Updated Sep 23 • 1.58M • 1.35k
stepfun-ai/GOT-OCR2_0 • Image-Text-to-Text • 0.7B • Updated Feb 4 • 21.1k • 1.53k
jiuhai/florence-vl-8b-sft • 9B • Updated Dec 3, 2024 • 9 • 21
AI-Safeguard/Ivy-VL-llava • Visual Question Answering • 4B • Updated Apr 28 • 154 • 71
OpenGVLab/InternVL2_5-78B • Image-Text-to-Text • 78B • Updated Sep 11 • 546 • 192
Qwen/QVQ-72B-Preview • Image-Text-to-Text • 73B • Updated Jan 12 • 406 • 609
deepseek-ai/deepseek-vl2 • Image-Text-to-Text • 27B • Updated Dec 18, 2024 • 43.4k • 369
allenai/Molmo-7B-D-0924 • Image-Text-to-Text • 8B • Updated Oct 9 • 37.1k • 554
prithivMLmods/Qwen2-VL-OCR-2B-Instruct • Image-Text-to-Text • 2B • Updated May 2 • 3.5k • 101
ByteDance/Sa2VA-1B • Image-Text-to-Text • 1B • Updated Sep 8 • 848 • 29
HuggingFaceTB/SmolVLM-500M-Instruct • Image-Text-to-Text • 0.5B • Updated Apr 8 • 14k • 182
Qwen/Qwen2.5-VL-72B-Instruct • Image-Text-to-Text • 73B • Updated Jun 6 • 234k • 566
Qwen/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • 8B • Updated Apr 6 • 3.65M • 1.37k
OpenGVLab/InternVideo2_5_Chat_8B • Video-Text-to-Text • 8B • Updated Aug 4 • 42.2k • 85
nvidia/Eagle2-9B • Image-Text-to-Text • 9B • Updated Jan 28 • 350 • 61
stepfun-ai/GOT-OCR-2.0-hf • Image-Text-to-Text • 0.6B • Updated Jan 31 • 27.2k • 219
allenai/olmOCR-7B-0225-preview • Image-to-Text • 8B • Updated Aug 19 • 6.13k • 703
microsoft/Magma-8B • Image-Text-to-Text • 9B • Updated May 13 • 4.99k • 411
marco/mcdse-2b-v1 • 2B • Updated Oct 29, 2024 • 2.8k • 56
CohereLabs/aya-vision-8b • Image-Text-to-Text • 9B • Updated Oct 30 • 44k • 313
Skywork/Skywork-R1V-38B • Image-Text-to-Text • 38B • Updated Aug 12 • 50.8k • 127
docling-project/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated Sep 17 • 181k • 1.6k
Qwen/Qwen2.5-VL-32B-Instruct • Image-Text-to-Text • 33B • Updated Apr 14 • 359k • 468
reducto/RolmOCR • Image-to-Text • 8B • Updated Apr 2 • 9.46k • 567
moonshotai/Kimi-VL-A3B-Thinking • Image-Text-to-Text • 16B • Updated Aug 18 • 39.9k • 442
XiaomiMiMo/MiMo-VL-7B-RL • Image-Text-to-Text • 8B • Updated Jun 7 • 3.65k • 165
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1 • Image-Text-to-Text • Updated Jun 13 • 907k • 168
ByteDance/Dolphin • Image-Text-to-Text • 0.4B • Updated Jul 16 • 3.98k • 507
nanonets/Nanonets-OCR-s • Image-Text-to-Text • 4B • Updated Jun 20 • 97.6k • 1.56k
echo840/MonkeyOCR • Image-Text-to-Text • Updated Aug 28 • 348 • 511
moonshotai/Kimi-VL-A3B-Thinking-2506 • Image-Text-to-Text • 16B • Updated Aug 18 • 135k • 323
prithivMLmods/DREX-062225-exp • Image-Text-to-Text • 8B • Updated Jul 20 • 31 • 5
zai-org/GLM-4.1V-9B-Thinking • Image-Text-to-Text • 10B • Updated Oct 25 • 384k • 755
HelloKKMe/GTA1-72B • Image-to-Text • 73B • Updated Jul 8 • 25 • 4
rednote-hilab/dots.ocr • Image-Text-to-Text • 3B • Updated about 1 month ago • 1M • 1.14k