---
license: apache-2.0
---

# Model Card: InternVL3_5-1B GGUF (Q4_0 Quantized)

## Quick Overview

A compact 1.1B-parameter multimodal model that understands both images and text, quantized to 4-bit (Q4_0) for efficiency.

**Key Features:**

- **Size:** ~484 MB (4-bit quantization of the original 1.1B-parameter model)
- **Capabilities:** Image description, visual QA, document understanding, basic reasoning
- **Format:** GGUF (runs on CPU/GPU)
- **License:** Apache 2.0

## What It Does

- Describe images and answer questions about them
- Read text in images (OCR)
- Understand documents and charts
- Basic multimodal reasoning

## Quick Start

Use with:

- LM Studio
- llama.cpp
- Ollama
- Any GGUF-compatible software

## Files

- `InternVL3_5-1B-Q4_0.gguf`: Main model file
- `mmproj-InternVL3_5-1B-Q4_0.gguf`: Multimodal projection file (needed for image input)

## Limitations

- Not for medical or safety-critical use
- May make mistakes (verify important outputs)
- Reduced precision due to quantization

## Original Model

- **Developer:** OpenGVLab
- **Full Name:** InternVL3.5-1B
- **Training:** Advanced multimodal training with reinforcement learning

## Repository Contents

The repository contains the following files:

1. `.gitattributes`
2. `InternVL3_5-1B-Q4_0.gguf`
3. `README.md` (this file)
4. `mmproj-InternVL3_5-1B-Q4_0.gguf`
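
The two GGUF files listed above are meant to be loaded together: the main model file plus the multimodal projection file. As a rough illustration, the sketch below assembles a llama.cpp command line that passes both. It assumes a recent llama.cpp build that ships the `llama-mtmd-cli` tool with the `-m`, `--mmproj`, `--image`, and `-p` options; the helper name `build_command`, the image name `example.jpg`, and the prompt are hypothetical and not part of this repository.

```python
import shlex
from pathlib import Path

# File names as listed in this repository.
MODEL_FILE = "InternVL3_5-1B-Q4_0.gguf"
MMPROJ_FILE = "mmproj-InternVL3_5-1B-Q4_0.gguf"


def build_command(model_dir, image_path, prompt, binary="llama-mtmd-cli"):
    """Return an argument list for a llama.cpp multimodal CLI run.

    Assumption: `binary` is llama.cpp's multimodal CLI and is on PATH.
    This helper only builds the command; it does not run anything.
    """
    model_dir = Path(model_dir)
    return [
        binary,
        "-m", str(model_dir / MODEL_FILE),          # main language model weights
        "--mmproj", str(model_dir / MMPROJ_FILE),   # vision projector weights
        "--image", str(image_path),                 # image to describe
        "-p", prompt,                               # text prompt
    ]


if __name__ == "__main__":
    # Hypothetical image and prompt, for illustration only.
    cmd = build_command(".", "example.jpg", "Describe this image.")
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run`. Other GGUF front ends such as LM Studio and Ollama have their own ways of loading the projection file, so check their documentation rather than relying on this sketch.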