datasets:
  - vita-video-gen/svi-benchmark
language:
  - en
tags:
  - video generation
pipeline_tag: image-to-video
library_name: diffusers
license: mit
project_page: https://stable-video-infinity.github.io/homepage/
papers:
  - title: >-
      Stable Video Infinity: Infinite-Length Video Generation with Error
      Recycling
    authors:
      - Wuyang Li
      - Wentao Pan
      - Po-Chien Luan
      - Yang Gao
      - Alexandre Alahi
    url: https://huggingface.co/papers/2510.09212
    conference: arXiv preprint, 2025

Stable Video Infinity: Infinite-Length Video Generation with Error Recycling

🎯 About This Repository

Stable Video Infinity (SVI) generates videos of ANY length with high temporal consistency, plausible scene transitions, and controllable streaming storylines in ANY domain. This repository contains the model weights of the SVI family.
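Since this repository hosts the weights, one way to fetch them locally is `snapshot_download` from the `huggingface_hub` package. The sketch below is a minimal example under stated assumptions: the repository id `vita-video-gen/svi-model` is inferred from the hosting page rather than stated in this card, and `download_svi_weights` is a hypothetical helper name, not part of the SVI codebase.

```python
from pathlib import Path

# Assumption: repository id inferred from the hosting page, not confirmed by this card.
REPO_ID = "vita-video-gen/svi-model"


def download_svi_weights(local_dir="svi-weights", allow_patterns=None):
    """Download the repository contents and return the local folder.

    allow_patterns is an optional list of glob patterns that limits the
    download to matching files; None downloads everything.
    """
    # Imported lazily so this module still loads without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        local_dir=local_dir,
        allow_patterns=allow_patterns,
    )
    return Path(path)
```

Calling `download_svi_weights()` pulls the whole family. To fetch a single model, pass glob patterns through `allow_patterns`; the file names are not listed in this card, so check the repository's file listing first.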

🌟 Key Highlights

OpenSVI: Everything is open-sourced: training & evaluation scripts, datasets, and more.
Infinite Length: No inherent limit on video duration; generate arbitrarily long stories (see the 10‑minute “Tom and Jerry” demo).
Versatile: Supports diverse in-the-wild generation tasks: multi-scene short films, single‑scene animations, skeleton-/audio-conditioned generation, cartoons, and more.
Efficient: Only LoRA adapters are tuned, which requires very little training data, so anyone can easily make their own SVI.
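To give a feel for why tuning only LoRA adapters is cheap, the sketch below counts trainable parameters for a single linear layer under the generic LoRA formulation, where a frozen weight of shape `d_out × d_in` is augmented with two low-rank factors of shapes `d_out × r` and `r × d_in`. The layer size and rank are illustrative assumptions, not SVI's actual configuration, and `lora_param_counts` is a hypothetical helper.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable parameter counts for one linear layer.

    full: parameters updated when the whole weight matrix is fine-tuned.
    lora: parameters in the two low-rank factors B (d_out x r) and A (r x d_in).
    """
    if rank <= 0 or rank > min(d_in, d_out):
        raise ValueError("rank must be in [1, min(d_in, d_out)]")
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Illustrative numbers only (assumed, not taken from SVI).
full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
# prints "full: 16,777,216  lora: 131,072  ratio: 0.78%"
```

With these assumed sizes the adapter holds under 1% of the layer's parameters, which is consistent with the small data requirement described above.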

📦 Resources

Model	Task	Input	Output	Hugging Face Link	Comments
ALL	Infinite possibility	Image + X	X video	🤗 Folder	Family bucket! I want to play with all!
SVI-Shot	Single-scene generation	Image + Text prompt	Long video	🤗 Model	Generate a consistent long video from a single text prompt. (This will never drift)
SVI-Film	Multi-scene generation	Image + Text prompt stream	Film-style video	🤗 Model	Generate a creative long video from one text prompt stream (5 seconds per prompt).
SVI-Film (Transition)	Multi-scene generation	Image + Text prompt stream	Film-style video	🤗 Model	Generate a creative long video from one text prompt stream. (More scene transitions due to the training data)
SVI-Tom&Jerry	Cartoon animation	Image	Cartoon video	🤗 Model	Generate creative long cartoon videos from one text prompt stream. (No drift in our 20-minute test)
SVI-Talk	Talking head	Image + Audio	Talking video	🤗 Model	Generate long audio-conditioned videos of a person speaking.
SVI-Dance	Dancing animation	Image + Skeleton	Dance video	🤗 Model	Generate long skeleton-conditioned videos of a person dancing.
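The SVI-Film row describes a prompt stream with roughly 5 seconds of video per prompt. As a rough illustration of what that implies for planning a storyline, the sketch below maps a list of prompts onto consecutive time windows. `Segment` and `schedule_prompt_stream` are hypothetical names for this example only; they are not part of the SVI code, and the real inference scripts may represent the stream differently.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    prompt: str
    start_s: float  # segment start time in seconds
    end_s: float    # segment end time in seconds


def schedule_prompt_stream(prompts, seconds_per_prompt=5.0):
    """Assign each prompt a consecutive, fixed-length time window."""
    segments = []
    for index, prompt in enumerate(prompts):
        start = index * seconds_per_prompt
        segments.append(Segment(prompt, start, start + seconds_per_prompt))
    return segments


# Example storyline (made up for illustration).
stream = [
    "A cat wakes up on a sunny windowsill.",
    "The cat jumps down and walks to the kitchen.",
    "The cat finds a bowl of food and starts eating.",
]
for seg in schedule_prompt_stream(stream):
    print(f"{seg.start_s:>5.1f}s - {seg.end_s:>5.1f}s  {seg.prompt}")
```

Under this assumption, a film of N prompts runs for about 5 × N seconds, so a one-minute film needs roughly 12 prompts.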

Note: If you want to play with text-to-video (T2V), you can directly use SVI with an image generated by any text-to-image (T2I) model!

📝 Citation

If you find our work helpful for your research, please consider citing our paper. Thank you so much!

@article{li2025stable,
      title={Stable Video Infinity: Infinite-Length Video Generation with Error Recycling}, 
      author={Wuyang Li and Wentao Pan and Po-Chien Luan and Yang Gao and Alexandre Alahi},
      journal={arXiv preprint arXiv:2510.09212},
      year={2025},
      url={https://huggingface.co/papers/2510.09212},
}