Open
Hi all,
I am generating zero-shot synthetic speech, and the result is not even close to the reference speaker's voice (sometimes it is even a different gender).
I pass the name of the reference speaker file to the tts.synthesise function. The generated audio differs from one reference speaker to another, but it never sounds similar to the target.
Any idea what could be wrong?
Just for reference: I use more than 1 minute of reference audio from the Multilingual LibriSpeech dataset (English part).