csuhan's Collections
Tar

updated Sep 20

[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

  • Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

    Paper • 2506.18898 • Published Jun 23 • 33

  • Running on Zero
    47

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • Running on Zero
    3

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • Sleeping
    60

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • ByteDance-Seed/Tar-1.5B

    Any-to-Any • 3B • Updated Jul 2 • 126 • 20

  • ByteDance-Seed/Tar-7B

    Any-to-Any • 9B • Updated Jul 2 • 338 • 38

  • ByteDance-Seed/Tar-TA-Tok

    Updated Jul 2 • 6

  • csuhan/Tar-SANA-600M-512px

    Text-to-Image • Updated Aug 15 • 2

  • csuhan/Tar-SANA-600M-1024px

    Text-to-Image • Updated Aug 15

  • csuhan/Tar-Lumina2

    Updated Sep 1

  • csuhan/tar_1.5B_pretrain_demo

    3B • Updated Jun 16 • 4 • 1