
Works great until you stumble on one of the big sites that will auto-ban you for not having a valid Google IP address.



* Reverse DNS records. Webmasters shouldn't be verifying Google's bots by hard-coding IP addresses.

https://support.google.com/webmasters/answer/80553?hl=en
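
For anyone wondering what the reverse DNS check looks like in practice, here is a rough sketch, not a definitive implementation. It assumes the usual pattern of a reverse lookup on the client IP, a check that the hostname sits under googlebot.com or google.com, and then a forward lookup to confirm the hostname maps back to the same IP. The function name and the injectable `reverse_lookup` / `forward_lookup` parameters are my own, added so the logic can be exercised without touching the network. The default forward lookup uses `socket.gethostbyname_ex`, which only handles IPv4.

```python
import socket

# Hostname suffixes assumed to identify Google crawlers (an assumption of this sketch).
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a reverse-then-forward DNS check.

    reverse_lookup: callable ip -> hostname (hypothetical hook, defaults to real DNS)
    forward_lookup: callable hostname -> list of IPs (hypothetical hook, IPv4 by default)
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: reverse DNS. A missing PTR record means we cannot verify.
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False

    # Step 2: the hostname must sit under an expected Google domain.
    # The leading dot in each suffix rejects names like "fakegooglebot.com".
    if not host.lower().rstrip(".").endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False

    # Step 3: forward DNS must map the hostname back to the original IP,
    # otherwise anyone controlling their own PTR record could spoof step 2.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_verified_googlebot(client_ip)` with no extra arguments does the real lookups. Caching the result per IP is probably worth it, since two DNS queries per request add up.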


Shouldn't != never happens


CIDR blocks and ASN advertisements are cheap.

Update those periodically (hours / days / weeks). The adverts don't change particularly quickly.
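
A minimal sketch of the CIDR approach, using only Python's standard `ipaddress` module. The ranges below are documentation-only placeholder networks, not Google's real blocks; in practice you would load whatever list you refresh periodically. The helper names are made up for illustration.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), NOT real crawler ranges.
# Replace with the periodically refreshed list you actually trust.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def load_networks(cidrs):
    """Parse CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def ip_in_networks(ip, networks):
    """Return True if `ip` falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # Only compare like with like: an IPv4 address is never in an IPv6 block.
    return any(addr.version == net.version and addr in net for net in networks)


networks = load_networks(PLACEHOLDER_RANGES)
```

With the list parsed once at startup (or on each refresh), `ip_in_networks(client_ip, networks)` is a cheap in-memory check with no DNS round trip.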


It's extremely rare to be IP-blocked by a website just for using Google's user agent from an address outside Google's ranges. IP's get re-used and you can switch to a new one easily, so it's really not common or good practice for this to happen.


> IP's get re-used and you can switch to a new one easily, so it's really not common or good practice for this to happen.

On the flip side, some people can't change their IP addresses easily, and for them getting IP banned (even if rare, for the reasons you stated) is a major hassle when it does happen. :/


Is that really a thing? That must be such a hazard for their developers. For sites I work on, I usually have a test that scrapes a few URLs as Googlebot to verify that they are getting an optimized view (no JS, structural-only CSS).
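
Roughly what such a test could look like, as a sketch under assumptions rather than the actual test described above. It uses the standard library's `urllib.request`, sends the commonly seen Googlebot user-agent string, and applies a crude "no `<script>` tags" check. The function names and the check itself are hypothetical.

```python
import urllib.request

# Widely used Googlebot user-agent string; the real crawler sends several variants.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def googlebot_request(url):
    """Build a request that identifies itself as Googlebot."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})


def fetch_as_googlebot(url, timeout=10):
    """Fetch `url` with the Googlebot user agent; returns (status, body text)."""
    with urllib.request.urlopen(googlebot_request(url), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def looks_script_free(html):
    """Crude check for the 'no JS' view: no <script> tags in the markup."""
    return "<script" not in html.lower()
```

A test would then call `fetch_as_googlebot` on a handful of URLs and assert on the status code and `looks_script_free(body)`. Note this only spoofs the header, so it will be rejected by any site that verifies the source IP.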


God. That's the reason so many sites look great in the results and are confusing interactive messes when I get into them.


If it's any consolation, the site is well tested in Noscript mode.


You, sir, are a god among Web developers.

If only they all did this. So many sites I get to and they're a blank page or an absolute disaster....


Yes. Googlebot only crawls from legit addresses (even when their developers are trying new things) so it's an easy scraper/scammer signal to key off of.


No. Most websites don't do this.



