bytedance-research/HuMo
Image-to-Video • Updated • 209 • 245
UMO based on OmniGen2
Inpaint images using Qwen Image with an inpainting ControlNet
Chat with a powerful language model
Detect objects in images and videos
Convert audio to text with context and language options
Generate images from text prompts