
echelon on July 27, 2020 | on: Show HN: Neural text to speech with dozens of cele...

I did! I tried creating a multi-speaker embedding model for practical concerns: saving on memory costs. I'm going to have to add additional layers, because it didn't fit individual speakers very well. I wish I'd saved audio results to share. I might be able to publish my findings if I look around for the model files.

I think you're right in that if we can get such a model to work, training new embeddings won't require much data.
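
A rough sketch of the memory argument, in case it helps: the sizes below are hypothetical placeholders, not figures from any model mentioned in this thread. It compares keeping one full model per speaker against one shared model plus a small embedding per speaker, and shows why a new speaker would only add an embedding's worth of trainable parameters if the shared weights stay frozen.

```python
# All sizes are hypothetical, chosen only to illustrate the comparison.
BASE_MODEL_PARAMS = 28_000_000  # assumed size of the shared synthesizer
EMBEDDING_DIM = 256             # assumed length of one speaker vector


def separate_models(num_speakers: int) -> int:
    """Total parameters when every speaker gets a full model of their own."""
    return num_speakers * BASE_MODEL_PARAMS


def shared_model(num_speakers: int) -> int:
    """Total parameters for one shared model plus one embedding per speaker."""
    return BASE_MODEL_PARAMS + num_speakers * EMBEDDING_DIM


def params_for_new_speaker() -> int:
    """Trainable parameters if only a new embedding is fitted (shared weights frozen)."""
    return EMBEDDING_DIM


if __name__ == "__main__":
    n = 40  # hypothetical number of voices
    print("separate:", separate_models(n))
    print("shared:  ", shared_model(n))
    print("new speaker:", params_for_new_speaker())
```

Under these assumptions the shared setup grows by only a few hundred parameters per voice, which is also why fitting a new embedding might need little data. Whether a single vector is expressive enough per speaker is a separate question, as the poor fit described above suggests.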

webmaven on July 28, 2020

Hmm. Would a multi-speaker model be able to interpolate between voices (e.g., halfway between Morgan Freeman and James Earl Jones)?
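
For concreteness, here is a minimal sketch of what "halfway between" could mean if each voice is a learned embedding vector. The vectors and the `lerp` helper are made up for illustration. Whether feeding the blended vector to a model would actually produce a convincing in-between voice is not something this thread establishes.

```python
from typing import List


def lerp(a: List[float], b: List[float], t: float) -> List[float]:
    """Linearly interpolate between two speaker embeddings.

    t = 0.0 returns a, t = 1.0 returns b, t = 0.5 is the midpoint.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


if __name__ == "__main__":
    # Hypothetical 4-dimensional embeddings; real ones would be learned
    # by the model and typically much longer.
    voice_a = [0.0, 2.0, -1.0, 4.0]
    voice_b = [1.0, 4.0, 1.0, 0.0]
    print(lerp(voice_a, voice_b, 0.5))
```

The blended vector would then be passed to the synthesizer in place of a single speaker's embedding. Linear blending is just the simplest choice. Other schemes, such as spherical interpolation, are possible too.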