Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation Paper • 2411.19331 • Published Nov 28, 2024 • 5
view article Article AtlasOCR: Building the First Open-Source Darija OCR Model with Vision Language Models Sep 16 • 18
YanoljaNEXT-Rosetta Collection Translation Model for JSON-Structured Data • 3 items • Updated Sep 3 • 8
MobileCLIP2 Collection MobileCLIP2: Mobile-friendly image-text models with SOTA zero-shot capabilities trained on DFNDR-2B • 37 items • Updated Sep 18 • 56
FastVLM Collection Efficient Vision Encoding for Vision Language Models • 9 items • Updated Sep 2 • 103
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 380