I replaced the model as required, but encountered other errors as follows:

by pkqbszs - opened Nov 4

Nov 4

Traceback (most recent call last):
  File "", line 4, in
  File "", line 744, in
  File "", line 448, in main
  File "/root/index-tts/indextts/infer_v2.py", line 83, in __init__
    load_checkpoint(self.gpt, self.gpt_path)
  File "/root/index-tts/indextts/utils/checkpoint.py", line 28, in load_checkpoint
    model.load_state_dict(checkpoint, strict=True)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2593, in load_state_dict
    raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for UnifiedVoice:
    size mismatch for text_embedding.weight: copying a param with shape torch.Size([6081, 1280]) from checkpoint, the shape in current model is torch.Size([12001, 1280]).
    size mismatch for text_head.weight: copying a param with shape torch.Size([6081, 1280]) from checkpoint, the shape in current model is torch.Size([12001, 1280]).
    size mismatch for text_head.bias: copying a param with shape torch.Size([6081]) from checkpoint, the shape in current model is torch.Size([12001]).
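An error like this means the checkpoint and the freshly built model disagree on the text vocabulary size. A minimal sketch of how the disagreeing parameters could be listed before calling load_state_dict with strict=True is shown here. The helper name find_shape_mismatches is hypothetical, it works on plain name-to-shape mappings rather than on tensors, and the final_norm.weight entry is an invented placeholder for a parameter that matches.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ.

    Hypothetical helper for illustration; it is not part of index-tts or PyTorch.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        if name not in model_shapes:
            # Missing or unexpected keys are a different kind of error; skip them here.
            continue
        ckpt_shape = tuple(ckpt_shape)
        model_shape = tuple(model_shapes[name])
        if ckpt_shape != model_shape:
            mismatches[name] = (ckpt_shape, model_shape)
    return mismatches


# Shapes for the two text_head entries are taken from the error message.
# "final_norm.weight" is a made-up parameter name standing in for one that matches.
checkpoint_shapes = {
    "text_head.weight": (6081, 1280),
    "text_head.bias": (6081,),
    "final_norm.weight": (1280,),
}
model_shapes = {
    "text_head.weight": (12001, 1280),
    "text_head.bias": (12001,),
    "final_norm.weight": (1280,),
}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

With PyTorch, the two mappings could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}` from the result of `torch.load(path, map_location="cpu")` and from `model.state_dict()`. Whether the checkpoint file stores the weights at the top level or nested under another key is an assumption to verify for your file.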

pkqbszs

Nov 4

Would you mind answering this question if it's convenient for you?

boinjj

Nov 8

edited Nov 8

Patch your checkpoints/config.yaml to fix this:
gpt:
    number_text_tokens: 6000
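For anyone who prefers to script this change, here is a minimal sketch that edits the value in place and leaves comments and all other lines untouched. It assumes the key sits on its own indented line under a top-level gpt: section and holds a plain integer. The function name patch_number_text_tokens and the sample config text are hypothetical, not taken from the repository.

```python
import re

# Matches an indented "number_text_tokens: <integer>" line and captures the prefix.
KEY_RE = re.compile(r"^(\s+number_text_tokens:\s*)\d+")


def patch_number_text_tokens(config_text, new_value):
    """Set gpt.number_text_tokens in YAML text without reformatting the file.

    Hypothetical helper for illustration; assumes a simple block-style YAML layout.
    """
    out = []
    in_gpt = False
    patched = False
    for line in config_text.splitlines(keepends=True):
        if line.strip() and not line[0].isspace() and not line.startswith("#"):
            # A new top-level key begins; remember whether it is the gpt section.
            in_gpt = line.split("#", 1)[0].strip() == "gpt:"
        elif in_gpt and KEY_RE.match(line):
            line = KEY_RE.sub(lambda m: m.group(1) + str(new_value), line, count=1)
            patched = True
        out.append(line)
    if not patched:
        raise KeyError("gpt.number_text_tokens not found")
    return "".join(out)


# Invented sample config; the real checkpoints/config.yaml has many more keys.
sample = (
    "dataset:\n"
    "    number_text_tokens: 1\n"
    "gpt:\n"
    "    model_dim: 1280\n"
    "    number_text_tokens: 12000  # vocab\n"
)
print(patch_number_text_tokens(sample, 6000))
```

To apply it to a real file, read checkpoints/config.yaml as text, pass it through the function, and write the result back, keeping a backup copy of the original first.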

pkqbszs

Nov 10

Thanks for your advice.
What do you think of this model's performance? I don't feel like it's working very well.

boinjj

Nov 11

Well, it's not working so far. It just synthesizes random clips that sound like a Japanese accent but contain no Japanese words at all.

WariHima changed discussion status to closed Nov 13

WariHima

Owner Nov 13

I'm sorry. This model accepts only AquesTalk-style kana as input, that is, the accent notation used in Japanese ゆっくり実況 (Yukkuri commentary) videos.
I have released an index-tts2 fork that automatically segments text and converts kanji to AquesTalk style.
I have also released the final weights and code. See the model card for how to use them.

WariHima changed discussion status to open Nov 13

WariHima

Owner Nov 13

Sorry.
Both the old and the newer models are pretrained models intended for fine-tuning.
Very short inputs are difficult: it is hard to generate speech from kana-kanji (かな漢字) text under 10 characters.
Few-shot is broken, but my index-tts2 fork includes a GUI that makes it easy to create a dataset (taken from GPT-SoVITS).
