---
license: apache-2.0
---
# Video-BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Generation
[📖 Paper](https://tacossp.github.io/BLADE-Homepage/) | [🚀 Quick Start](#-quick-start---inference) | [💾 Models](https://huggingface.co/GYP666/VIDEO-BLADE) | [📖 中文阅读 (Chinese)](README_zh.md)
Video-BLADE is a data-free framework for efficient video generation. By jointly training an adaptive block-sparse attention mechanism with step distillation, it reduces the number of inference steps from 50 to just 8 while maintaining high generation quality, yielding a significant end-to-end speedup.
## 📢 News
- **[Aug 2025]** 🎉 The code and pre-trained models for Video-BLADE have been released!
- **[Aug 2025]** 📝 Two mainstream video generation models, CogVideoX-5B and WanX-1.3B, are now supported.
- **[Aug 2025]** ⚡ Achieved high-quality video generation in just 8 steps, a significant speedup over the 50-step baseline.
## ✨ Key Features
- 🚀 **Efficient Inference**: Reduces the number of inference steps from 50 to 8 while preserving generation quality.
- 🎯 **Adaptive Sparse Attention**: Employs a block-sparse attention mechanism to significantly reduce computational complexity (see the sketch after this list).
- 📈 **Step Distillation**: Utilizes Trajectory Distribution Matching (TDM), enabling training without the need for video data.
- 🎮 **Plug-and-Play**: Supports CogVideoX-5B and WanX-1.3B models without requiring modifications to their original architectures.
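To make the block-sparse idea concrete, here is a minimal PyTorch sketch in which each query block attends only to its top-scoring key blocks. This is an illustrative toy, not the fused CUDA kernel from Block-Sparse-Attention; the mean-pooled block-scoring heuristic and all names are assumptions for exposition.
```python
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block_size=64, keep_ratio=0.25):
    """Toy block-sparse attention: each query block attends only to its
    top-scoring key blocks. q, k, v: (seq_len, dim), seq_len divisible
    by block_size."""
    seq_len, dim = q.shape
    nb = seq_len // block_size
    # Rank (query-block, key-block) pairs by mean-pooled similarity.
    q_blocks = q.view(nb, block_size, dim).mean(dim=1)   # (nb, dim)
    k_blocks = k.view(nb, block_size, dim).mean(dim=1)   # (nb, dim)
    block_scores = q_blocks @ k_blocks.T                 # (nb, nb)
    # Keep only the top `keep_ratio` fraction of key blocks per query block.
    n_keep = max(1, int(nb * keep_ratio))
    top_idx = block_scores.topk(n_keep, dim=-1).indices  # (nb, n_keep)
    block_mask = torch.zeros(nb, nb, dtype=torch.bool, device=q.device)
    block_mask[torch.arange(nb, device=q.device).unsqueeze(1), top_idx] = True
    # Expand the block mask to token resolution and run masked attention.
    mask = block_mask.repeat_interleave(block_size, 0).repeat_interleave(block_size, 1)
    scores = (q @ k.T) / dim ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Example: self-attention over a 512-token sequence.
x = torch.randn(512, 64)
out = block_sparse_attention(x, x, x)  # (512, 64)
```
A real kernel never materializes the dense score matrix; it skips the masked blocks entirely, which is where the speedup comes from.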
## 🛠️ Environment Setup
### System Requirements
- Python >= 3.11 (Recommended)
- CUDA >= 11.6 (Recommended)
- GPU Memory >= 24GB (for Inference)
- GPU Memory >= 80GB (for Training)
### Installation Steps
1. **Clone the repository**
```bash
git clone https://github.com/Tacossp/VIDEO-BLADE
cd VIDEO-BLADE
```
2. **Install dependencies**
```bash
# Install using uv (Recommended)
uv pip install -r requirements.txt
# Or use pip
pip install -r requirements.txt
```
3. **Compile the Block-Sparse-Attention library**
```bash
git clone https://github.com/mit-han-lab/Block-Sparse-Attention.git
cd Block-Sparse-Attention
pip install packaging
pip install ninja
python setup.py install
cd ..
```
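After compilation, a quick import check confirms the extension was built correctly. The package name `block_sparse_attn` is taken from the library's directory layout; adjust it if the upstream layout changes.
```python
# Sanity check: the compiled extension should import without errors.
import torch
import block_sparse_attn  # noqa: F401  (name from the library layout)

print("CUDA available:", torch.cuda.is_available())
```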
## 📥 Model Weights Download
### Base Model Weights
Please download the following base model weights and place them in the specified directories:
1. **CogVideoX-5B Model**
```bash
# Download from Hugging Face
git lfs install
git clone https://huggingface.co/zai-org/CogVideoX-5b cogvideox/CogVideoX-5b
```
2. **WanX-1.3B Model**
```bash
# Download from Hugging Face
git clone https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B-Diffusers wanx/wan1.3b
```
### Pre-trained Video-BLADE Weights
We provide pre-trained weights for Video-BLADE:
```bash
# Download pre-trained weights
git clone https://huggingface.co/GYP666/VIDEO-BLADE pretrained_weights
```
### Weight Directory Structure
Ensure your directory structure for weights is as follows:
```
VIDEO-BLADE/
├── cogvideox/
│   └── CogVideoX-5b/           # Base model weights for CogVideoX
├── wanx/
│   └── wan1.3b/                # Base model weights for WanX
└── pretrained_weights/         # Pre-trained weights for Video-BLADE
    ├── BLADE_cogvideox_weight/
    └── BLADE_wanx_weight/
```
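Before running inference, you can quickly confirm the layout with a small check (paths taken from the tree above):
```python
# Verify that the expected weight directories are in place.
from pathlib import Path

expected = [
    "cogvideox/CogVideoX-5b",
    "wanx/wan1.3b",
    "pretrained_weights/BLADE_cogvideox_weight",
    "pretrained_weights/BLADE_wanx_weight",
]
for p in expected:
    print(("OK      " if Path(p).is_dir() else "MISSING ") + p)
```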
## 🚀 Quick Start - Inference
### CogVideoX Inference
```bash
cd cogvideox
python train/inference.py \
    --lora_path ../pretrained_weights/BLADE_cogvideox_weight/your_checkpoint \
    --gpu 0
```
**Argument Descriptions**:
- `--lora_path`: Path to the LoRA weights file.
- `--gpu`: The ID of the GPU device to use (Default: 0).
**Output**: The generated videos will be saved in the `cogvideox/outputs/inference/` directory.
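For orientation, here is a hedged sketch of what such an entry point boils down to in plain diffusers: load the base pipeline, attach the distilled LoRA, and sample in 8 steps. `train/inference.py` remains the authoritative script; the prompt, output name, and LoRA path below are illustrative assumptions.
```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# Load the base model and attach the distilled Video-BLADE LoRA.
pipe = CogVideoXPipeline.from_pretrained(
    "cogvideox/CogVideoX-5b", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("pretrained_weights/BLADE_cogvideox_weight")

# 8-step sampling, distilled from the 50-step baseline.
video = pipe(
    prompt="A panda playing guitar in a bamboo forest",
    num_inference_steps=8,
).frames[0]
export_to_video(video, "outputs/inference/sample.mp4", fps=8)
```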
### WanX Inference
```bash
cd wanx
python train/inference.py \
    --lora_path ../pretrained_weights/BLADE_wanx_weight/your_checkpoint \
    --gpu 0
```
**Output**: The generated videos will be saved in the `wanx/outputs/` directory.
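An analogous hedged sketch for WanX, assuming diffusers' `WanPipeline` for the Wan2.1 Diffusers checkpoints (again, the repo script is authoritative and the prompt and output path are illustrative):
```python
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "wanx/wan1.3b", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("pretrained_weights/BLADE_wanx_weight")

video = pipe(
    prompt="A cat surfing a wave at sunset",
    num_inference_steps=8,
).frames[0]
export_to_video(video, "wanx/outputs/sample.mp4", fps=16)
```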
## 📊 Project Structure
```
VIDEO-BLADE/
├── README.md                          # Project documentation
├── requirements.txt                   # List of Python dependencies
│
├── cogvideox/                         # Code related to CogVideoX
│   ├── CogVideoX-5b/                  # Directory for base model weights
│   ├── train/                         # Training scripts
│   │   ├── inference.py               # Inference script
│   │   ├── train_cogvideo_tdm.py      # Training script
│   │   ├── train_tdm_1.sh             # Script to launch training
│   │   ├── modify_cogvideo.py         # Model modification script
│   │   └── config.yaml                # Training configuration file
│   ├── prompts/                       # Preprocessed prompts and embeddings
│   └── outputs/                       # Output from training and inference
│
├── wanx/                              # Code related to WanX
│   ├── wan1.3b/                       # Directory for base model weights
│   ├── train/                         # Training scripts
│   │   ├── inference.py               # Inference script
│   │   ├── train_wanx_tdm.py          # Training script
│   │   ├── train_wanx_tdm.sh          # Script to launch training
│   │   └── modify_wan.py              # Model modification script
│   ├── prompts/                       # Preprocessed prompts and embeddings
│   └── outputs/                       # Output from training and inference
│
├── utils/                             # Utility scripts
│   ├── process_prompts_cogvideox.py   # Data preprocessing for CogVideoX
│   ├── process_prompts_wanx.py        # Data preprocessing for WanX
│   └── all_dimension_aug_wanx.txt     # Training prompts for WanX
│
├── Block-Sparse-Attention/            # Sparse attention library
│   ├── setup.py                       # Compilation and installation script
│   ├── block_sparse_attn/             # Core library code
│   └── README.md                      # Library usage instructions
│
└── ds_config.json                     # DeepSpeed configuration file
```
## 🤝 Acknowledgements
- [FlashAttention](https://github.com/Dao-AILab/flash-attention), [Block-Sparse-Attention](https://github.com/mit-han-lab/Block-Sparse-Attention): For the foundational work on sparse attention.
- [CogVideoX](https://github.com/THUDM/CogVideo), [Wan2.1](https://github.com/Wan-Video/Wan2.1): For the supported models.
- [TDM](https://github.com/Luo-Yihong/TDM): For the foundational work on the distillation method.
- [Diffusers](https://github.com/huggingface/diffusers): For the invaluable diffusion models library.
## 📄 Citation
If you use Video-BLADE in your research, please cite our work:
```bibtex
@article{video-blade-2025,
  title={Video-BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Generation},
  author={},
  year={2025}
}
```
## 📧 Contact
For any questions or suggestions, feel free to:
- Contact Youping Gu at youpgu71@gmail.com.
- Submit an issue on our [Github page](https://github.com/Tacossp/VIDEO-BLADE/issues).