Di♪♪Rhythm
Blazingly Fast and Embarrassingly Simple Song Generation
Chatterbox TTS supporting 23 languages
Analyze images to detect objects, points, keypoints, or text
Controllable emotional/voice-acting TTS (now with v1.1)
Generate Vietnamese speech from text
Easily remove your video's background!
Translate text between 200 languages
MegaTTS 3 but with voice cloning!
Generate a video by interpolating between two images with a prompt
Voice conversion framework based on VITS
Launch a customizable user interface
Generate captions for images in various styles
Conversational speech generation
Generate videos from images and prompts
Speedy and Accurate Image to 3D Generator
SeedVR2-3B Image & Video API Demo
Generate podcast- and TikTok-style video avatars
Test Qwen Image Edit LoRA :)
State-of-the-art music analysis with multi-scale datasets
Anime-Llasa-3B-Captions-Demo
Spatial reasoning with vision-language models
Find matching images based on input criteria
Launch a web interface for model interaction
Generate a video from a single image