
Then search engine crawlers should get paywalled too.

The motivation is correct. You run a search on Google and get mostly paywalled content. I'm fine with news sites requiring subscriptions to view their articles, but they shouldn't also get the benefit of being listed at the top of search results for key terms.

Alternatively, search results should indicate whether content is paywalled, or offer options to filter paywalled content out.



I'm not following your logic here. The Economist wants to charge readers because they produce high quality content that's worth paying for (in their estimation). This seems entirely orthogonal to whether they blacklist a web crawler - a crawler they didn't even ask for, which would be all over their website whether they want it or not.

I think you're confused because the crawler and the browser both use the same channel and the same protocols to access the information (the website over HTTP). But that's just a detail. Google could send them a handwritten form to fill out with details of each of their articles, plus some thumbnail images, to be manually entered into a Google database for all we care.


> I think you're confused because the crawler and the browser both use the same channel and the same protocols to access the information (the website over HTTP). But that's just a detail. Google could send them a handwritten form to fill out with details of each of their articles, plus some thumbnail images, to be manually entered into a Google database for all we care.

I disagree; I care quite a bit about whether the Google results reflect the actual page I'm going to see, or merely what the page author claimed the page would be about. (Indeed, I'm old enough to remember that what originally set Google apart from competing search engines was that it ignored the meta keyword tags authors used to describe their pages, in favour of indexing the visible page content directly.)
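For anyone who never saw them, those tags were self-reported metadata in the page head, roughly like this (an illustrative snippet, not from any real site):

    <!-- Author-supplied metadata; early engines trusted it, Google indexed the visible body text instead -->
    <meta name="keywords" content="economics, finance, world news">
    <meta name="description" content="Authoritative analysis of world events">

Trivially stuffable with whatever terms you wanted to rank for, which is exactly why they stopped being trusted.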


And surely we'd want high-quality content returned from a search engine. If Google never returned results where the company wanted me to buy something, the results would be pretty sparse.


Seems like you should take that up with Google? Why should The Economist be obligated to serve no content to a crawler just because they want to charge readers a fair price for their content?


I think the person you're replying to is agreeing with you.


You're a couple of lines of robots.txt away from not having your site appear in the search engines most people use. Meanwhile, putting something up on the open internet includes the risk that people and robots will see it.
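For reference, those couple of lines look like this (a minimal sketch; Googlebot is Google's crawler token, and other engines have their own):

    # Ask Google's crawler not to fetch any page on this site
    User-agent: Googlebot
    Disallow: /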


IMO the issue is that the paywall is essentially "cloaking" by Google's webmaster standards: different content is served to the crawler (the actual text of the article, which gets indexed) than to the user (a paywall).

The content provider might not ask for the crawler, but they are certainly catering to it - and benefitting from it.
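Worth noting that Google carved out a sanctioned path for exactly this: publishers can mark paywalled sections with structured data so that serving full text to the crawler isn't treated as cloaking. A rough sketch of that JSON-LD (the cssSelector value here is a made-up example):

    {
      "@context": "https://schema.org",
      "@type": "NewsArticle",
      "isAccessibleForFree": "False",
      "hasPart": {
        "@type": "WebPageElement",
        "isAccessibleForFree": "False",
        "cssSelector": ".paywalled-section"
      }
    }

The selector tells Google which part of the page sits behind the paywall, so the index and the user-visible page can legitimately differ.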


I’m not confused and I don’t really understand why you think I am.

I believe the content that is indexed is the content you can see. Sites used to be penalised, heavily, for returning different content to Google. Hiding the paywall from Google falls into that bucket.

At a minimum, the search results should indicate whether a result is paywalled and provide tools to exclude that content from results.



