Hanyu66's Collections
AwesomeLLMs

updated 7 days ago

  • Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts

    Paper • 2309.15915 • Published Sep 27, 2023 • 2

  • Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants

    Paper • 2310.00653 • Published Oct 1, 2023 • 3

  • Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

    Paper • 2308.12966 • Published Aug 24, 2023 • 11

  • An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

    Paper • 2309.09958 • Published Sep 18, 2023 • 19

  • Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

    Paper • 2308.13437 • Published Aug 25, 2023 • 4

  • DeepEyesV2: Toward Agentic Multimodal Model

    Paper • 2511.05271 • Published 19 days ago • 41

  • Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings

    Paper • 2511.05017 • Published 20 days ago • 7

  • Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

    Paper • 2511.04962 • Published 20 days ago • 51

  • Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

    Paper • 2511.04570 • Published 20 days ago • 202