Audio Samples
The following samples were generated with a DNN-based architecture similar to Tacotron-GST. The architecture is conditioned on speaker IDs: each training utterance is paired with the ID of its speaker, and the ID of the target speaker is supplied again at synthesis time.
The speakers are a subset of those in the SWARA Corpus.
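As a rough illustration of what speaker-ID conditioning can look like, the sketch below looks up a per-speaker vector and appends it to every encoder time step. This is only one common way to do it and is not necessarily how the model behind these samples works. The names `speaker_table` and `condition_on_speaker`, the dimensions, and the random values standing in for learned weights are all assumptions made for the example, and the style-token part of Tacotron-GST is left out entirely.

```python
import random

# All sizes below are illustrative assumptions, not values from the model.
NUM_SPEAKERS = 9   # one entry per speaker in the table on this page
SPEAKER_DIM = 4    # hypothetical speaker-embedding size
ENCODER_DIM = 6    # hypothetical text-encoder output size

rng = random.Random(0)

# Hypothetical lookup table: one vector per speaker ID.
# In a real system these values would be learned during training.
speaker_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(SPEAKER_DIM)]
    for _ in range(NUM_SPEAKERS)
]


def condition_on_speaker(encoder_outputs, speaker_id):
    """Append the chosen speaker's embedding to every encoder time step."""
    if not 0 <= speaker_id < NUM_SPEAKERS:
        raise ValueError(f"unknown speaker id: {speaker_id}")
    embedding = speaker_table[speaker_id]
    # List concatenation builds a new frame; the inputs are left unchanged.
    return [frame + embedding for frame in encoder_outputs]


# Stand-in for text-encoder outputs: 5 time steps of ENCODER_DIM values.
encoder_outputs = [
    [rng.uniform(-1.0, 1.0) for _ in range(ENCODER_DIM)]
    for _ in range(5)
]

conditioned = condition_on_speaker(encoder_outputs, speaker_id=2)
print(len(conditioned), len(conditioned[0]))  # 5 10
```

The same function would be called during training (with the ID of the utterance's speaker) and at synthesis time (with the ID of the desired voice), which is the sense in which the ID is provided in both steps.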
Speaker | Natural | Sample 1 | Sample 2 |
---|---|---|---|
Speaker 1 | | | |
Speaker 2 | | | |
Speaker 3 | | | |
Speaker 4 | | | |
Speaker 5 | | | |
Speaker 6 | | | |
Speaker 7 | | | |
Speaker 8 | | | |
Speaker 9 | | | |