Kuroki Tomoko Piper voice model

Sound sample: "A wild Tomoko appears!"

A single-voice English Piper text-to-speech model, trained on top of a high-quality (HQ) pre-trained female voice using about four minutes of audio from the English dub of Kuroki Tomoko, a character from WataMote.

How to use with Home Assistant via the Wyoming protocol and a separate wyoming-piper container:

Create a directory for the files, e.g. "piper-docker".
Inside it, create a directory "piper" and make sure your Docker user has read/write access to it.
Create piper-docker/docker-compose.yaml:
services:
  piper:
    container_name: piper
    image: rhasspy/wyoming-piper:latest
    command: --voice en_US-tomoko-high
    volumes:
      - ./piper:/data
      - ./voices.json:/usr/local/lib/python3.11/dist-packages/wyoming_piper/voices.json:ro
    restart: unless-stopped
    ports:
      - "10200:10200"
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Create piper-docker/voices.json:
{
    "en_US-tomoko-high": {
        "key": "en_US-tomoko-high",
        "name": "en_US-tomoko-high",
        "language": {
            "code": "en_US",
            "family": "en",
            "region": "US",
            "name_native": "English",
            "name_english": "English",
            "country_english": "US"
        },
        "quality": "high",
        "num_speakers": 1,
        "speaker_id_map": {},
        "files": {
            "en_US-tomoko-high.onnx": {
                "size_bytes": 114204023,
                "md5_digest": "4ef5706f34966ca020a449af4ec6cb66"
            },
            "en_US-tomoko-high.onnx.json": {
                "size_bytes": 7093,
                "md5_digest": "ef6424752cb3ab0c6cd1e1eeeea833db"
            },
            "MODEL_CARD": {
                "size_bytes": 0,
                "md5_digest": "d41d8cd98f00b204e9800998ecf8427e"
            }
        },
        "aliases": []
    }
}

Put the voice files from this repo into piper-docker/piper.

Rename the files:
mv kuroki_tomoko.onnx en_US-tomoko-high.onnx
mv kuroki_tomoko.onnx.json en_US-tomoko-high.onnx.json

Check the MD5 sums and file sizes (they might have changed since I wrote this, because I keep tweaking my local setup):
md5sum *tomoko*
wc -c *tomoko*

Then update piper-docker/voices.json if anything changed.
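
If you would rather not copy the numbers by hand, the check-and-update step can be sketched in Python. This is only a rough helper, not part of wyoming-piper: the function names file_entry and refresh_voices are made up for illustration, and it assumes voices.json has the layout shown above. Files listed in voices.json but not present in the voice directory (such as MODEL_CARD) are left untouched.

```python
import hashlib
import json
from pathlib import Path


def file_entry(path: Path) -> dict:
    """Return the size and MD5 digest of one file, in the voices.json shape."""
    data = path.read_bytes()
    return {"size_bytes": len(data), "md5_digest": hashlib.md5(data).hexdigest()}


def refresh_voices(voices_json: Path, voice_dir: Path) -> list[str]:
    """Update size_bytes/md5_digest for every listed file found in voice_dir.

    Hypothetical helper: returns the names of the files whose entries changed
    and rewrites voices_json only if at least one entry was out of date.
    """
    voices = json.loads(voices_json.read_text(encoding="utf-8"))
    changed = []
    for voice in voices.values():
        for name, entry in voice["files"].items():
            path = voice_dir / name
            if not path.is_file():
                continue  # listed but not on disk (e.g. MODEL_CARD): skip it
            actual = file_entry(path)
            if any(entry.get(key) != value for key, value in actual.items()):
                entry.update(actual)
                changed.append(name)
    if changed:
        voices_json.write_text(json.dumps(voices, indent=4) + "\n", encoding="utf-8")
    return changed
```

Run from inside piper-docker, calling refresh_voices(Path("voices.json"), Path("piper")) returns the list of file names that were corrected, or an empty list if voices.json already matched.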

Run "docker compose up" and watch for errors. If there are none, go into Home Assistant, open "Settings, Devices & services, Wyoming Protocol", and add a service for Piper using the IP address of the machine where your container is running and port 10200.
Then go to "Settings, Voice assistants, Add assistant" (this assumes you already have an LLM set up using the Ollama add-on; I won't go into that here). In the "Text to speech" section of the dialog, "Text to speech" should already be set to "piper" and "Voice" to "en US-tomoko-high (high)".
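
If Home Assistant cannot find the service, it may help to confirm that port 10200 is reachable from another machine before digging further. The following is a minimal sketch using only the Python standard library. The function name port_open is made up for illustration, and it only checks that something accepts TCP connections on the port, not that it actually speaks the Wyoming protocol.

```python
import socket


def port_open(host: str, port: int = 10200, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, port_open("192.168.1.50") should return True once the container is up, where 192.168.1.50 is a placeholder for the IP address of your Docker host.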

TLDR - rhasspy/wyoming-piper is very specific about how it wants things named, or it will refuse to use the voice files.
PS - onnxruntime is a shitshow to build from source. I tried, I failed, and I went back to the docker container.
PPS - despite the current version of the container being built for CUDA 12.1, it still worked under CUDA 13.0 via the nvidia docker container runtime.
