
As I read the generator diagram (figure 6), it is generating phoneme probabilities, not waveforms.
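To make the distinction concrete, here is a rough sketch of what "phoneme probabilities" means as an output: one distribution over a phoneme inventory per frame, rather than audio samples. The phoneme list, the logits, and the greedy decode with merged repeats are all made up for illustration; none of it is taken from the paper or from any wav2vec code.

```python
import math

# Hypothetical phoneme inventory, for illustration only.
PHONEMES = ["sil", "HH", "AH", "L", "OW"]

def softmax(logits):
    """Turn one frame's raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(frame_logits):
    """Greedy decode: most likely phoneme per frame, consecutive repeats merged."""
    out = []
    for logits in frame_logits:
        probs = softmax(logits)
        best = PHONEMES[probs.index(max(probs))]
        if not out or out[-1] != best:
            out.append(best)
    return out

# Made-up generator output: 6 frames x 5 phoneme scores (not real model output).
frame_logits = [
    [0.1, 3.0, 0.2, 0.0, 0.1],
    [0.1, 2.5, 0.4, 0.0, 0.1],
    [0.0, 0.2, 3.1, 0.1, 0.0],
    [0.0, 0.1, 0.3, 2.8, 0.2],
    [0.0, 0.1, 0.3, 2.9, 0.2],
    [0.1, 0.0, 0.2, 0.3, 3.3],
]

print(decode(frame_logits))  # ['HH', 'AH', 'L', 'OW']
```

The output is a symbol sequence, so nothing about the speaker's voice or accent survives in it.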

You could train a similar system on the output of wav2vec to produce audio, though it probably won't sound like the input audio (accent, voice) unless you expose more features of the input than just phonemes.



