Gedas Bertasius

gberta
https://www.gedasbertasius.com/
  • gberta227
  • gberta

authored a paper 8 months ago

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Paper • 2503.09590 • Published Mar 12 • 3
authored a paper about 1 year ago

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Paper • 2409.07450 • Published Sep 11, 2024 • 11
authored 6 papers over 1 year ago

Is Space-Time Attention All You Need for Video Understanding?

Paper • 2102.05095 • Published Feb 9, 2021 • 2

SimpleClick: Interactive Image Segmentation with Simple Vision Transformers

Paper • 2210.11006 • Published Oct 20, 2022

Unified Coarse-to-Fine Alignment for Video-Text Retrieval

Paper • 2309.10091 • Published Sep 18, 2023

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Paper • 2101.12059 • Published Jan 28, 2021

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

Paper • 2401.10529 • Published Jan 19, 2024 • 1

Video ReCap: Recursive Captioning of Hour-Long Videos

Paper • 2402.13250 • Published Feb 20, 2024 • 26