fnauman's Collections

multimodal

updated 6 days ago

  • vikhyatk/moondream2

    Image-Text-to-Text • 2B • Updated Sep 23 • 1.54M • 1.34k

  • Qwen/Qwen2.5-VL-7B-Instruct

    Image-Text-to-Text • 8B • Updated Apr 6 • 4.22M • 1.36k

  • google/gemma-3-27b-it-qat-q4_0-gguf

    Image-Text-to-Text • 27B • Updated Apr 11 • 7.12k • 360

  • google/paligemma2-3b-mix-224

    Image-Text-to-Text • 3B • Updated Feb 7 • 8.84k • 39

  • HuggingFaceTB/SmolVLM2-256M-Video-Instruct

    Image-Text-to-Text • 0.3B • Updated Apr 8 • 19.6k • 83

  • unsloth/Qwen2.5-VL-3B-Instruct-GGUF

    Image-Text-to-Text • 3B • Updated May 12 • 7.52k • 17

  • OpenGVLab/InternVL3-1B

    Image-Text-to-Text • 0.9B • Updated Sep 11 • 92.8k • 75

  • BLIP3o/BLIP3o-Model-8B

    14B • Updated Jun 4 • 1.04k • 102

  • FastVLM: Efficient Vision Encoding for Vision Language Models

    Paper • 2412.13303 • Published Dec 17, 2024 • 72

  • jinaai/jina-clip-v2

    Feature Extraction • 0.9B • Updated Apr 28 • 334k • 290

  • Qwen/Qwen3-VL-4B-Instruct

    Image-Text-to-Text • 4B • Updated Oct 15 • 707k • 239

  • PaddlePaddle/PaddleOCR-VL

    Image-Text-to-Text • 1.0B • Updated 12 days ago • 31.6k • 1.36k

  • PerceptronAI/Isaac-0.1

    Text Generation • 3B • Updated Oct 9 • 3.27k • 108

  • moondream/refcoco-m

    Viewer • Updated 8 days ago • 1.19k • 35.9k • 43