echelon on Dec 28, 2021 | parent | context | favorite | on: Mocking Bird – Realtime Voice Clone for Chinese

Hours of audio consisting of clean spoken sentences, with zero noise and uniform microphone quality, are ideal.

Some of the predominant base data sets used for transfer learning, such as LJSpeech [1], are unfortunately noisy and non-uniform.

[1] https://keithito.com/LJ-Speech-Dataset/
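For anyone who wants to screen their own recordings against those criteria, here is a rough sketch, not a method taken from any particular toolkit. It flags clips whose sample rate differs from the rest of the set, or whose estimated noise floor is high. The function names, the 20 ms frame size, the "quietest 10% of frames" heuristic, and the -50 dBFS threshold are all assumptions picked for illustration; tune them for your own data.

```python
import math
import struct
import wave
from collections import Counter


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file; return (sample_rate, samples in [-1, 1))."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, [s / 32768.0 for s in samples]


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (-inf for digital silence)."""
    if not samples:
        return float("-inf")
    mean_sq = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_sq) if mean_sq > 0 else float("-inf")


def noise_floor_dbfs(samples, rate, frame_ms=20):
    """Heuristic noise floor: the level at the 10th percentile of frame levels.

    Assumes the quietest frames are pauses between words, so their level
    approximates the background noise. This is an illustrative heuristic.
    """
    frame = max(1, rate * frame_ms // 1000)
    levels = sorted(
        rms_dbfs(samples[i:i + frame])
        for i in range(0, len(samples) - frame + 1, frame)
    )
    if not levels:
        return rms_dbfs(samples)
    return levels[len(levels) // 10]


def check_clips(clips, max_noise_dbfs=-50.0):
    """clips maps name -> (sample_rate, samples). Returns (name, reason) pairs."""
    issues = []
    if not clips:
        return issues
    # Treat the most common sample rate as the reference for the set.
    common_rate = Counter(rate for rate, _ in clips.values()).most_common(1)[0][0]
    for name, (rate, samples) in clips.items():
        if rate != common_rate:
            issues.append((name, "sample rate"))
        if noise_floor_dbfs(samples, rate) > max_noise_dbfs:
            issues.append((name, "noise floor"))
    return issues
```

To use it on files, build the `clips` dict with `read_wav_mono16` and pass it to `check_clips`. This only catches the crude problems (mismatched sample rates, audible hiss); it says nothing about microphone coloration or reverb, which still need listening.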

kragen on Dec 29, 2021 [–]

Thank you very much!
