HTTP 200 = "Cloudflare, please cache this status message instead of passing through a million requests to our dead server while it's busy restoring a backup".
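To make the trade-off concrete, here's a minimal sketch of the two ways a CDN-fronted outage page could answer. The values are illustrative assumptions, not HN's or Cloudflare's actual configuration:

```python
# Hypothetical sketch: two possible header sets for an outage page behind a CDN.
# Values are illustrative, not what HN/Cloudflare actually configured.

outage_as_200 = {
    "status": 200,
    # ~10 years in seconds: the CDN absorbs all traffic,
    # but clients may keep seeing the outage page long after recovery.
    "Cache-Control": "public, max-age=315360000",
}

outage_as_503 = {
    "status": 503,
    # Clients are told to retry in 5 minutes...
    "Retry-After": "300",
    # ...and the CDN still shields the origin, just briefly.
    "Cache-Control": "public, max-age=60",
}
```

Either way the origin is protected; the difference is how long clients stay stuck on the stale page.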
PG doesn't care about HN's search listings, so there are no drawbacks to doing that.
> "Probably because we restrict their crawlers. But this is an excellent side effect, because the last thing I want is traffic from Google searches."
Am I the only one who finds that puzzling?
This isn't Fight Club. It's not even Entrepreneur Club. It's a bunch of generally smart people talking about technology, with an emphasis on making money from it. It's one of my favorite sites, and I love it, but it's not an invitation-only club, is it?
(I also find it weird that one of the go-to sites for web-savvy people would be like, "yeah, screw status codes and how the open, linked, web is supposed to work".)
To be clear, I'm not Protesting a Great Evil. I just find it puzzling, as in, "That's odd, I must not understand what this is all about, after all."
I can't speak for PG, but I think the general idea is that a slow influx of new users is less likely to alter the nature of HN as everyone has a chance to acclimatize (avoiding some sort of Eternal September), and the people who really "need" to be on HN (people interested in startups I guess?) will know about HN already, or be told about it. That last part might be a little "fightclub-ish" I guess, but it seems to be working alright.
Couldn't he just turn off registrations for new accounts? Not saying he needs to get HN to the top for a "startup" search query. I found HN by a Google search while looking for a good laptop to run Linux on.
That happens. When too many people register accounts, registration is locked for the rest of the day and the "create account" option disappears, leaving only login.
IIRC (I might search for it later) that was a spambot fix. Apparently it was fairly effective - I presume the bots were smart enough to find the 'login' link on the front page then register an account from there but not much else.
It's not just puzzling, it's almost criminal considering how much of the culture and important decisions get discussed here.
A lot of the time you have people who are direct parties to <insert thing here> come and talk about it, only for it to become forever inaccessible because Google can't get its mittens on it.
Let's not even get started on how some URLs expire.
Not really. There were plenty of times when I tried to find an article from a few days back, but Google came up blank even with `site:news.ycombinator.com`. I had to resort to scrolling through HN's Facebook bot page (it posts all the front-paged links).
How does that make sense? As if CloudFlare would honor status codes but ignore cache headers (which in this case stipulated an absurd 10-year expiration).
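The problem with that 10-year expiration can be sketched with the standard HTTP freshness rule (a response is served from cache while its age is under max-age). This is an illustrative sketch, not Cloudflare's actual logic:

```python
# Hedged sketch of HTTP cache freshness (RFC 7234 style), showing why
# a 200 outage page with a 10-year max-age never gets revalidated.
import time

TEN_YEARS = 10 * 365 * 24 * 3600  # 315,360,000 seconds

def is_fresh(stored_at: float, max_age: int, now: float) -> bool:
    """A cached response is served without contacting the origin
    while its age (now - stored_at) is below max-age."""
    return (now - stored_at) < max_age

stored = time.time()
# Twelve hours later the cached outage page is still "fresh", so the
# cache keeps serving it even though the site is back up.
print(is_fresh(stored, TEN_YEARS, stored + 12 * 3600))  # True
```

With a 503 and a short max-age (say 60 seconds), the same rule would have the cache revalidate against the origin within a minute of recovery.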
I seriously doubt Cloudflare's behaviour would be that stupid; wouldn't it briefly cache error pages instead of hammering the server? At a minimum it would throttle or coalesce concurrent requests.
Unfortunately, it took me 12 hours to find out the site was back up, because the outage page had been cached for me. It eventually dawned on me that I had to do a hard refresh of the page.