Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Although you'd imagine screenshots would be easy to OCR reliably, it's not guaranteed to get everything correct.

It's not like you can rely on a dictionary to confirm you've correctly OCRed a post by "@4EyedJediO" - who knows if that's an O or a 0 at the end?

And if you're OCRing the title and view count of a youtube video, for example, you've got to take the page layout into account because there's a recommendations sidebar full of other titles with different view counts.



I guess you'd get better results if you knew the font the site uses (which in many cases you could figure it out pretty quickly) or even just override every font with your own.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: