
Love the idea.

It also shows that GPT-4V created a new angle in web scraping.

I guess this or similar code will be leveraged in many projects, like:

1. Scrape XXX websites. Say LinkedIn or Twitter use all kinds of methods in the DOM to prevent scraping, but fighting a well-working GPT-4V + OCR setup would be ultra hard.

2. Give me an analysis of what these XXX companies are doing. And this could be done for competitors, to understand the landscape of some industry, or even plainly to get news.

Large-scale scraping that does not depend on the source code of the pages is a powerful infrastructural change.




It took me a while to get what you meant, because... I'm not sure "XXX websites" usually means what you intended to convey here :)


I feel very innocent now, as it did not even cross my mind ;)



