ruochenx's Collections

Multimodal LLM

updated Aug 23, 2024

  • MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

    Paper • 2408.02718 • Published Aug 5, 2024 • 62

  • LLaVA-OneVision: Easy Visual Task Transfer

    Paper • 2408.03326 • Published Aug 6, 2024 • 61