And yet, anybody who actually works in machine learning has repeatedly said "fig... | Hacker News

groby_b on Nov 16, 2017 | on: Startup Ideas

And yet, anybody who actually works in machine learning has repeatedly said: "Figure out what problem you're solving first, then determine what data you might need. Don't just throw a classifier at a data garbage heap."

And experience doesn't show that more data is better; it shows that an excess of features leads to overfitting.

yorwba on Nov 17, 2017

Sure, having more features gives a model more opportunities to overfit. But having more data points has the opposite effect, since they reflect the underlying distribution better and therefore provide a better estimate of actual model performance.
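To make the two effects concrete, here's a rough sketch on made-up data (nothing here comes from a real dataset, and the names `make_data`, `fit_least_squares` and `generalization_gap` are just illustrative). The target depends only on the first feature; every other feature is pure noise. It fits plain least squares and reports test error minus training error, which you'd expect to grow as noise features are added and shrink as training points are added.

```python
import random

def make_data(n_points, n_features, rng):
    """Synthetic data (an assumption for illustration): y depends only on
    the first feature; all remaining features are pure noise."""
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_points)]
    y = [2.0 * row[0] + rng.gauss(0, 1) for row in X]
    return X, y

def fit_least_squares(X, y):
    """Solve the normal equations (X^T X) w = X^T y by Gaussian elimination."""
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * t for row, t in zip(X, y))] for i in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    w = [0.0] * p
    for i in reversed(range(p)):
        w[i] = (A[i][p] - sum(A[i][j] * w[j] for j in range(i + 1, p))) / A[i][i]
    return w

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    preds = (sum(a * b for a, b in zip(row, w)) for row in X)
    return sum((pred - t) ** 2 for pred, t in zip(preds, y)) / len(y)

def generalization_gap(n_train, n_features, seed=0):
    """Test MSE minus training MSE; a large gap indicates overfitting."""
    rng = random.Random(seed)
    X_train, y_train = make_data(n_train, n_features, rng)
    X_test, y_test = make_data(2000, n_features, rng)  # held-out sample
    w = fit_least_squares(X_train, y_train)
    return mse(X_test, y_test, w) - mse(X_train, y_train, w)

if __name__ == "__main__":
    # (training points, features): few features, many features, many features + more data
    for n_train, n_features in [(40, 2), (40, 30), (2000, 30)]:
        gap = generalization_gap(n_train, n_features)
        print(f"n_train={n_train:5d} n_features={n_features:3d} gap={gap:.3f}")
```

This is only a toy linear setup, so it says nothing about whether the extra data you happen to have is relevant to the problem you're solving. It just separates "more columns" from "more rows".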
