Audio Samples
The following samples were generated with a DNN-based architecture similar to Tacotron-GST. The model is additionally conditioned on speaker IDs during training, and the target speaker ID is provided again at synthesis time.
The speakers are a subset of those in the SWARA Corpus.
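As a rough illustration of what conditioning on a speaker ID can look like, the sketch below looks up a per-speaker embedding and appends it to every encoder time step. This page does not say how the speaker ID is injected into the model, so concatenation is an assumption here (it is one common choice). The names `make_speaker_table` and `condition_on_speaker`, the embedding size, and the toy dimensions are all hypothetical.

```python
import random


def make_speaker_table(num_speakers, dim, seed=0):
    """Hypothetical lookup table: one embedding vector per speaker ID.

    In a real model these vectors would be learned during training;
    here they are random placeholders.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(num_speakers)]


def condition_on_speaker(encoder_outputs, speaker_table, speaker_id):
    """Append the chosen speaker's embedding to every encoder time step.

    Assumption: conditioning by concatenation. The page does not
    describe the actual mechanism used.
    """
    if not 0 <= speaker_id < len(speaker_table):
        raise ValueError(f"unknown speaker id: {speaker_id}")
    embedding = speaker_table[speaker_id]
    # List concatenation builds a new, wider frame for each time step.
    return [frame + embedding for frame in encoder_outputs]


# Toy dimensions: 9 speakers, 5 encoder time steps of width 8.
speaker_table = make_speaker_table(num_speakers=9, dim=4)
encoder_outputs = [[0.0] * 8 for _ in range(5)]

conditioned = condition_on_speaker(encoder_outputs, speaker_table, speaker_id=2)
print(len(conditioned), len(conditioned[0]))  # 5 12
```

The same lookup is applied in training and in synthesis. Only the ID passed in changes, which is what lets a single model produce speech for each speaker in the table.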
| Speaker | Natural | Sample 1 | Sample 2 |
|---|---|---|---|
| Speaker 1 | |||
| Speaker 2 | |||
| Speaker 3 | |||
| Speaker 4 | |||
| Speaker 5 | |||
| Speaker 6 | |||
| Speaker 7 | |||
| Speaker 8 | |||
| Speaker 9 | |||