Post 3327
Trained a model for emotion-controllable TTS based on MiMo audio on LAION's dataset. Still very early and does have an issue with hallucinating, but results seem pretty good so far, given that it is very early into the training run. Will probably kick off a new run later with some settings tweaked. Put up a demo here: mrfakename/EmoAct-MiMo (turn 🔊 on to hear audio samples).
Post 3374
What a fantastic community!
Post 1756
Glyph 🔥 a framework that scales context length by compressing text into images and processing them with vision–language models, released by Z.ai.
Paper: https://huggingface.co/papers/2510.17800
Model: https://huggingface.co/zai-org/Glyph
✨ Compresses long sequences visually to bypass token limits
✨ Reduces computational and memory costs
✨ Preserves meaning through multimodal encoding
✨ Built on GLM-4.1V-9B-Base