
KVAE-3D 1.0: Video tokenizer

The KVAE-3D model compresses video by a factor of 4 in time and 8x8 in space, and uses 16 latent channels.
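To make the compression factors concrete, the sketch below works out the latent tensor shape and the raw element-count ratio for a 4x8x8 tokenizer with 16 latent channels. It is only arithmetic, not the model's API: the function names `latent_shape` and `compression_ratio` are hypothetical, the input is assumed to be 3-channel RGB, and the frame count is assumed to divide evenly by 4. The released model may treat the first frame or padding differently.

```python
def latent_shape(frames, height, width,
                 t_factor=4, s_factor=8, latent_channels=16):
    """Latent shape as (channels, frames, height, width).

    Assumes plain division by the compression factors; this is an
    illustrative assumption, not the confirmed behavior of KVAE-3D.
    """
    if frames % t_factor or height % s_factor or width % s_factor:
        raise ValueError("input dimensions must be multiples of the factors")
    return (latent_channels,
            frames // t_factor,
            height // s_factor,
            width // s_factor)


def compression_ratio(input_channels=3, t_factor=4, s_factor=8,
                      latent_channels=16):
    """Number of input values represented by one latent value."""
    block_values = input_channels * t_factor * s_factor * s_factor
    return block_values / latent_channels


if __name__ == "__main__":
    # A 16-frame 256x256 RGB clip under the assumptions above.
    print(latent_shape(16, 256, 256))   # (16, 4, 32, 32)
    print(compression_ratio())          # 48.0
```

Under these assumptions each 4x8x8 block of RGB pixels (768 values) is represented by 16 latent values, so the element count shrinks by a factor of 48.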

Evaluation results

Comparison of reconstructions from KVAE-3D and HunyuanVideo:

[Figure: kvae3d_comparison]

Evaluation results of the KVAE-3D model on the MCL-JCV dataset. All compared models perform 4x8x8 compression with 16 latent channels. Higher is better for PSNR and SSIM; lower is better for LPIPS:

| Model        | PSNR  | SSIM | LPIPS |
|--------------|-------|------|-------|
| Wan-2.1      | 33.75 | 0.90 | 0.089 |
| HunyuanVideo | 33.91 | 0.91 | 0.103 |
| KVAE-3D      | 35.63 | 0.92 | 0.088 |
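For readers unfamiliar with the PSNR column, here is a minimal sketch of the standard PSNR definition applied to flat lists of pixel values. It is not the evaluation script behind the numbers above: the function name `psnr` is hypothetical, 8-bit pixels (peak value 255) are assumed, and the actual protocol (color space, per-frame versus per-video averaging) is not stated in this card.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    Standard definition: 10 * log10(max_value^2 / MSE). The 8-bit peak
    value is an assumption made for this example.
    """
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2
              for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    decoded = [101, 101, 101, 101]  # every pixel off by one level
    print(round(psnr(original, decoded), 2))
```

Because PSNR is logarithmic in the mean squared error, a difference of about 1.7 dB, as between KVAE-3D and HunyuanVideo in the table, corresponds to roughly a one-third reduction in mean squared error.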