Falcon-8B

Description

[Paper] [GitHub] [Project Page]

This repository hosts the official model weights for FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers. In this work, we propose the FALCON model, which introduces a novel visual register technique to simultaneously address visual redundancy and fragmentation in the high-resolution visual encoding of multimodal large language models (MLLMs).

How to Run?

Please refer to the instructions in the GitHub repository.

Citation

If you find this work useful for your research, please kindly cite our paper:

@InProceedings{zhang2025falcon,
    author={Zhang, Renshan and Shao, Rui and Chen, Gongwei and Zhang, Miao and Zhou, Kaiwen and Guan, Weili and Nie, Liqiang},
    title={FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month={October},
    year={2025},
}