
Instead of the Reddit corpus, you might just as well use a picture library of human footprints. It would be no less promising.

Human speech is produced from the conscious experience of being a human being. If your dataset contains just the speech, without the experience, there's simply not enough there. Any machine trained on this data is doomed to talk hollow rubbish.



