Introducing the Invisible reCAPTCHA

macrael · on March 9, 2017

Google using captchas to get humans to read street addresses captured by street view cars to improve maps results remains one of the most Googly things they've ever done. Genius, lateral, and a little weird.

new299 · on March 9, 2017

Right, but it was really just an extension of the reCAPTCHA technology. Originally applied only to books.

reCAPTCHA was a technology acquired by Google. Not developed in house.

https://en.wikipedia.org/wiki/ReCAPTCHA

meowface · on March 10, 2017

The founder, Luis von Ahn, later went on to found Duolingo. Duolingo uses a similar "twofer" crowdsourcing strategy.

mrighele · on March 9, 2017

Having to do free work for Google is more than weird for me. For this reason, unless the page is really important, I end up closing the tab. If it were for an open service (such as OpenStreetMap) it would be a different matter though.

danso · on March 9, 2017

How is that "free work"? Automated usage can threaten a site's bottom line, and even its existence. A site resorts to Google's CAPTCHA service because they don't have the money to build their own detector. It's not a service that was free for Google to create.

edit: Also worth noting that some of this "free work" that users do for Google is used to improve bot-detection overall. The comment to this blog post (the post itself, and its paper are great reads) is a nice example:

https://security.googleblog.com/2013/10/recaptcha-just-got-e...

Essentially, the user is complaining (and justifiably so) of being served the pre-Street-View versions of CAPTCHA; I had forgotten how bad they could get: http://i.imgur.com/01F2eES.png

dsp1234 · on March 9, 2017

> How is that "free work"?

The original recaptcha was used to help clear up text from book digitization projects where OCR couldn't understand the data. The appeal of this was that these works were in the public domain, and thus proper digitization of these works is a society-wide benefit.

Since Google purchased it, they've been using it to:

1.) Clean street view data for a proprietary product (Google maps)

2.) Build training sets for unknown ML purposes

These are activities that Google could very much pay a group of people to do. Instead, through recaptcha, they are getting that work from the end user for no payment. A case could be made that it's not free for the owners of the sites that deploy recaptcha (because they get value out of the service, and Google gets data/ML services). However, the actual end user who has to fill out the recaptcha does not benefit in any significant way. Since a recaptcha is an inconvenience to the end user, that user pays both with the time to fill it out, and the data gathered by Google.

TL;DR Some people do not like that Google benefits from a transaction where Google is not a party, and where otherwise, Google could generate the benefit using their own resources.

danso · on March 9, 2017

Recaptcha was not a free service. Or if it was, why did Google end up paying millions for it? And it's not free now, unless the engineers and scientists working on it are working pro bono.

According to Wikipedia and the New York Times, reCaptcha was not developed for public domain works. Its pilot project was to digitize the NYT archives, archives which were not released to the public domain nor are fully available without being a subscriber: http://www.nytimes.com/2011/03/29/science/29recaptcha.html

I'm not a machine learning expert but I'm going to laugh at your suggestion that Google could "pay a group of people to do". In the above referenced NYT article from 2011, recaptcha's creator says several million words were being processed by recaptcha per day.

And again, have to disagree that the end user "does not benefit in any significant way". We would not be discussing this if Google hadn't learned from massive user data to iterate their captcha from distorted word mush to what it is today. Captcha was a serious drain of user energy and patience, that's why recaptcha was invented in the first place. And the worst captchas were tolerated because automated usage was a financial threat to websites that end users use.

eriknstr · on March 9, 2017

I had totally forgotten how bad the word mush was. In light of this, I agree that we do get something back from doing unpaid work for Google, and that is a HUGELY improved user experience with the current reCAPTCHA like "click all the pictures of sandwiches" or "click the portions of the picture that have a street sign in them" compared to the earlier reCAPTCHA and other CAPTCHAs.

wst_ · on March 10, 2017

Which still makes you work for them. Just, instead of letters, you get a bunch of images to recognize.

dsp1234 · on March 9, 2017

> Recaptcha was not a free service.

It was provided as a free service[0] (search free). I'm pretty sure that no one in this conversation is saying that the cost to run the service is free, and arguing such is a strawman.

> was not developed for public domain works.

> pilot project was to digitize the NYT archives

While the original "recaptcha.net" website is no longer available online (even via archive.org), there are plenty of sources still available that belie this claim: "that will help convert printed text into computer-readable letters on behalf of the Internet Archive"[0]. "The team is involved in digitising old books and manuscripts supplied by a non-profit organisation called the Internet Archive"[1]. "There's still about 100 million books to be digitised, which at the current rate will take us about 400 years to complete - Luis von Ahn, Carnegie Mellon"[1]. The sources are from 2007, when recaptcha was first introduced. There are more, but I picked the first two by going back to the beginning of the Wikipedia entry for recaptcha and looking at the supplied sources.

This is before Google bought it in 2009. While I can't speak for anyone else, this is what I mean when I say "original" recaptcha.

> laugh at your suggestion that Google could "pay a group of people to do".

They got a group of people to do it for free, which implies (barring salaries) that they could also get a group of people to do it for not-free. Thus the argument that people don't want to work for Google for free.

> Google hadn't learned from massive user data

And when Google has met its ML goals and decides it gets no further benefit from recaptcha? Remember, it's not a free-to-run service. They only continue it as long as they benefit. If recaptcha shuts down, it would have been nicer to have the fruits of that work available through something like the Internet Archive, than something like Google Books.

[0] - http://www.cmu.edu/news/archive/2007/May/may24_recaptcha.sht...

[1] - http://news.bbc.co.uk/2/hi/technology/7023627.stm

johnfn · on March 9, 2017

> However, the actual end user who has to fill out the recaptcha does not benefit in any significant way.

This isn't true. The end user benefits in a very obvious way by being able to use a site that hasn't been crippled or spammed by bots. It is very possible that many sites either would not exist or would be of much worse quality without a recaptcha-like service.

Everyone profits from recaptcha.

mrighele · on March 9, 2017

It is "free work" since I'm helping train Google's algorithms and I don't get money back. Note that the service is free for the site owner, but not for the end user, who has to invest his/her own time.

danso · on March 9, 2017

A site uses captcha because it is threatened by automated use. A site uses Google's CAPTCHA because the site cannot afford to build an (effective) CAPTCHA. If Google required the site to pay to use their CAPTCHA, those costs would be passed in some way to the user.

In other words, if Google's service didn't exist, you'd be experiencing less of the site, or no site at all if they chose to use a paywall that you refuse to pay for.

mrighele · on March 9, 2017

I am not saying that the service is not useful. It certainly is. What I am arguing is that it requires the users to do some amount of work and so it is not free for them, and I am of the opinion that Google benefits from it much more than I do.

Anyway, I am not completely against it, as I said above depending on the situation I make use of it.

geofft · on March 9, 2017

It's the usual comparative-advantage thing, though. Five seconds of your brain reading something is essentially worthless to you (unless someone is paying you to read this comment), but of value to Google. The resource behind the CAPTCHA is of minimal value to Google (what are they going to do with an account at startup-of-the-month), but of more value to you. The transaction is mutually beneficial.

The value Google gets from the thing they want, and the value you get from the thing you want, aren't particularly comparable.

Karunamon · on March 9, 2017

The alternative would be you're doing different work (i.e. more intellectually demanding) on each different site as they invent their own CAPTCHA, and rather than benefitting anybody, it would just be an annoyance between you and the content you want to reach.

IMO, this is strictly better.

sowbug · on March 9, 2017

Taxes work the same way. You pay money (or, if you wish, you work for money you don't get to keep) but don't get money in return. Rather, the money goes toward societal improvements. You can debate whether they're societal improvements you personally care about, but it's not a very interesting debate, because societal benefits aren't meant to be uniformly attractive to everyone.

kuschku · on March 9, 2017

Except, everything funded by my taxes has to be open, freely accessible, etc.

Google is using ReCaptcha as a measure to get an unfair edge over competitors and make competing even more impossible.

danso · on March 9, 2017

> Except, everything funded by my taxes has to be open, freely accessible, etc.

You must not be living in the U.S. Even as strong as the FOIA and public records laws are, there's a huge amount of information and records that are exempted from free access:

http://www.rcfp.org/browse-media-law-resources/digital-journ...

coroxout · on March 10, 2017

I'll take the mangled words over the new version, where Google shows you a variety of photos and you have to pick which ones belong to a category.

Does a sweet pastry count as a cake? Is this stretch of water a river, a lake or the sea? Are those trees I see in the distance on this photo of a mountain? Does the front of a restaurant or a veterinary surgery count as a "storefront"? As a British English speaker I had to guess at that last one, and some of the others seem culturally dependent too.

Or, my personal nemesis, the ones which ask you to select the squares containing road signs, and there's always a couple of squares containing a 3-pixel strip of the very edge of the road sign, and you don't know whether you're supposed to count those or not.

annnnd · on March 9, 2017

GP is talking about visitors doing free work for Google. Google can use the results to train its algorithms - this is a big set of data and very valuable.

Of course the site (and with it visitors) benefit in this exchange, so I agree with you that it is not entirely "free".

goodplay · on March 9, 2017

I actually feed google false information (select rectangular objects when asked to select traffic signs, flat surfaces when asked for water surfaces, and so on). It'll probably be filtered out when aggregated with other users, but I like to think that I'm at least influencing it even if a little.

I wouldn't have a problem if they'd open the data set, but I'm not exactly keen on being forced to work for free.

arjie · on March 9, 2017

So you'd do the work if it were not productive but not otherwise? Like, if they just threw away your input into the CAPTCHA after confirming you're human, you're okay with that? Or do you just hate CAPTCHAs on principle?

Freak_NL · on March 9, 2017

It's an ethical distinction for some. I am completely fine with a type of captcha that helps improve some public domain or free/libre licensed resource (e.g., proofreading words scanned from public domain works). I am not okay with helping some megacorporation sanitize their proprietary database or train their machine learning neural networks for free.

As mentioned elsewhere in this topic; I'd much rather spend some of my free time to help improve OpenStreetMap (and I do).

arjie · on March 9, 2017

But what if it's a generic captcha? It doesn't improve anything. Are you okay with that?

m_t · on March 9, 2017

A generic captcha doesn't use the work done by the user. I'm not the person you're replying to, but I think I can guess what could be their answer.

arjie · on March 9, 2017

What would it be? It's not clear to me.

Going from throwing away the work to improving Google's image recognition is a Pareto improvement so I'm interested in the justification.

Freak_NL · on March 9, 2017

But to me this is not a Pareto improvement (i.e., this is subjective). I consider the bolstering of the hold Google (and Apple, and Amazon, and Microsoft) has over a number of common domains (and in a broader sense our lives) to be undesirable and unethical. So when I correctly teach Google's proprietary image recognition software what a traffic sign, mountain, store front (ugh, these are the worst), or tree is, I am (even in the strict economic sense of a Pareto improvement) harming myself.

arjie · on March 9, 2017

Ah. All right. Thanks.

yjftsjthsd-h · on March 9, 2017

Not everyone is using the same ethical model as you?

arjie · on March 9, 2017

Probably. If they were why would I ask them to justify their position? It would trivially correspond with mine.

michaelt · on March 9, 2017

Imagine you're at an airport and an inspector takes your batteries out of your luggage. All other things being equal, does it matter if he tosses them in the trash, or keeps them for his personal use?

From one perspective it doesn't matter - your batteries are gone either way.

But from another perspective, inspectors who keep what they find have completely different motivations - making the same outcome much more sinister and corrupt.

arjie · on March 9, 2017

It makes sense there because the actual consequences are undesirable. What are the undesirable consequences we're incentivising here? That Google will cover the entire web in captchas?

kuschku · on March 9, 2017

That Google gets an unfair advantage compared to other companies, making it harder to start a new company able to compete with Google on Machine Learning? (Which is already impossible, because Google uses the knowledge from ReCaptcha, GMail, etc unfairly)

quickben · on March 9, 2017

For a more accurate money analogy: Imagine he resells them.

gnarbarian · on March 9, 2017

this comes to mind:

https://www.youtube.com/watch?v=gd7fE9_7_7A

kasperni · on March 9, 2017

But you use their service for free, right?

goodplay · on March 9, 2017

Neither the host nor google asks for money in exchange for their services. The profitability of their chosen business model is not my concern.

If they want something in return, perhaps they should consider setting up a paywall.

Klathmon · on March 9, 2017

What's going to stop you from just stealing the content from behind the paywall because the profitability of their chosen business model is not your concern?

goodplay · on March 9, 2017

Their competence in restricting access to the information and preventing its dissemination, I suppose.

Klathmon · on March 9, 2017

And you really believe that to be ethical and normal? Do you apply those same beliefs to the physical world? Is it okay to just take goods and services with no payment as long as you are physically able to?

goodplay · on March 9, 2017

No, I do not believe it is okay to take physical goods or receive services with no payment. I do however believe that people should be free to send and receive any information they wish, regardless of the wishes of the "owner" of said information. Sharing information without the right holder's involvement will not make them lose access to it.

If you, as a site owner, don't want people to share information your server sent to their computers, implement strong DRM or don't allow off-premise access altogether. You might lose many prospective users, but at least your information will remain "safe". Probably.

If you want to share information on a website but can't think of a reasonable way to make money off it, that's your problem, not mine.

Lastly, I fail to see how a discussion on copyright relates to recaptchas or even site-owners' failed business models. Unless you can get this thread back on topic, perhaps we should continue this (most likely fruitless) discussion somewhere else.

Klathmon · on March 9, 2017

I feel this is on topic as it relates to you using Google services without payment.

This isn't information sharing, this is using a service which has costs (hardware, software, employee time, power, cooling, etc...). You are using the ReCAPTCHA service. If you don't pay them for that (either in time or in money) then in my opinion you are stealing the cost of that service.

Just like how if you got a taxi, then just walked away without paying. Yeah, you aren't stealing any physical goods, but you are costing the taxi company something and you aren't paying for it. And if you agreed to get a free taxi ride in exchange for washing their taxi car for them, and you just walked away at the end, that would be stealing in my eyes.

Using the reCAPTCHA service lets them use the data from it for machine learning. That's the reason they are providing it for free, and if you attempt to use the service without paying (or in your scenario maliciously provide false information, which not only doesn't contribute, but can mess up the results from those who do), I feel you are stealing it just the same.

goodplay · on March 10, 2017

I disagree that it is stealing because I never agreed (or wanted) to participate in this program. The site owner did.

I'm under no obligation (moral or otherwise) to make honest contributions. If we wanted to continue with your analogy, it'll be like being forced into a taxi, then driven one meter, then forced to pay a large fee because the landlord doesn't want me to walk on his lawn. I didn't choose to ride the taxi. The landlord did (ignoring saner options).

If google wants a revenue from their service, they should consider charging site owners.

If google wants to train their NNs, they'll have to pay people to classify data for them.

If google doesn't want to do either, and doesn't want people like me contributing to their system, don't let site owners allow us to use their site.

Otherwise, I think google should shut down the service.

Klathmon · on March 10, 2017

If you really want to keep using the analogy, it's like you going to a restaurant and they require you to take a taxi to the actual location (and tell you the cost upfront), and you just take it then refuse to pay.

You do have the option of not using a service that uses recaptcha. Nobody is forcing you to fill them out, you are making the choice to do it because you want the service the site owner is providing, and the site owner wants recaptcha.

goodplay · on March 10, 2017

Nah, the owner wants to block scrapers.

If a taxi company keeps getting screwed over by patrons of a given restaurant, the company should stop providing service for said patrons, seek compensation directly from the people who require their service (the restaurant), or seek legal action against either the patrons or the restaurant.

Demanding otherwise is just naive.

Sean1708 · on March 9, 2017

A paywall is going to stop far more people from using a product than a captcha.

kalleboo · on March 9, 2017

4chan used to have a campaign where they'd collectively replace the obviously-unknown word in the ReCaptcha text captchas with the n-word to poison the results...

simonh · on March 9, 2017

I use a lot of free Google services like Gmail and Google Maps, so I don't mind helping to improve those services. I'd feel hypocritical otherwise.

stephenr · on March 9, 2017

Because you feel that data mining everything you do and serving you creepy-level ads isn't enough?

patrickaljord · on March 9, 2017

What's wrong with showing me relevant ads?

stephenr · on March 9, 2017

The problem is the method of building the "profile" that determines what's "relevant" for you.

gingerbread-man · on March 9, 2017

Google lets you turn off "personalized ads" if you don't want them. https://support.google.com/ads/answer/2662922?hl=en

icebraining · on March 9, 2017

That only lets you stop seeing the personalized ads. It doesn't say it stops tracking you.

patrickaljord · on March 9, 2017

What's wrong with them building a profile of me as long as I consented to it? Are you opposed to consenting agreements between willing adults and companies?

stephenr · on March 9, 2017

Well besides the creepy factor of what this data could be used for, the average user doesn't "consent" to it, because they have no idea it happens.

patrickaljord · on March 9, 2017

Then in this case the problem seems to be ignorance. Ignorant people will always find themselves in trouble though. Anyway, you still haven't proved that they wouldn't be ok with this even if aware of it. I've talked to many people about this and most were ok with it. I think you're assuming a lot here about what others care about.

It seems like you have a problem with people consenting to things you don't like, just like homophobes have problems with gays because they personally don't like consenting to same sex intercourse and think gays are ignorant of some facts (they're going to hell etc), maybe people don't care or believe in these facts and are just enjoying themselves, ever thought of that? What's the problems if other people enjoy things you don't? Live and let live.

quickben · on March 9, 2017

Because in 5, 10, or 200 years, when societies (or rules) change, it will be used to discriminate or worse.

"Whoever doesn't study history is doomed to repeat it."

patrickaljord · on March 9, 2017

Well, if we have people willing to kill us based on the type of socks we like or porn we enjoy, we have a bigger problem than the fact that this information is being collected. It's not like people willing to kill for such ridiculous reasons would be stopped by the lack of available data. Crazies don't really care about fact and data in general. I think you're assuming a bit too much about them.

Obligatory xkcd about it https://xkcd.com/792/

tonyedgecombe · on March 9, 2017

It modifies your behaviour in ways that aren't in your interest.

piyush_soni · on March 9, 2017

And how's that different than advertisements in newspapers/magazines and TV? People never seemed to have a problem with that, and no one ever said "you are the product". Somehow, that is associated mostly only with Google.

icebraining · on March 9, 2017

People have always said that, it's just that the mediums we had before were much more censored than the Internet.

“You are the product of t.v.(...) you are delivered to the advertiser who is the customer. He consumes you”

https://en.wikipedia.org/wiki/Television_Delivers_People

xg15 · on March 9, 2017

And if at some point, the industry shifts to direct brain-stimulation to influence behavior, someone will say "how is that different from targeted advertising? People seemed to have no problem with that..."

How exactly would you expect that "having a problem with it" manifests? By not watching TV and not reading any kind of newspaper?

mirimir · on March 9, 2017

Fast-forwarding through ads has been an extremely popular feature for VCRs.

sdiepend · on March 9, 2017

The precision with which it can be done.

dredmorbius · on March 9, 2017

First off: tu quoque. You're changing the argument, which I'll take to mean you've conceded the point.

But since you ask, it's very much the same, and people have very much had a problem with that, which raises a second fault with your argument: false premise.

Noam Chomsky. Pretty much his life's work.

Jerry Mander, Four Arguments for the Elimination of Television. (NB: Mander is a former ad executive himself).

Hamilton Holt, a magazine publisher himself, wrote of the fundamental problem in 1909, Commercialism and Journalism. https://archive.org/details/commercialismjou00holtuoft

I.F. Stone spoke of the problem in 1974 on "Day at Night". The video is now on YouTube: https://m.youtube.com/watch?v=qV3gO3zxQ1g

Edward R. Murrow's "Wires and Lights in a Box" speech criticised the television industry generally, in the 1950s. http://www.rtdna.org/content/edward_r_murrow_s_1958_wires_li...

A couple of Stanford researchers in the 1990s recognised the corrosive effects of advertising on incentives in the provision of online services. The prior awareness didn't save them from falling into the same pit: http://infolab.stanford.edu/~backrub/google.html

Neil Postman. Technopoly and Amusing Ourselves to Death.

Vance Packard, The Hidden Persuaders

Naomi Oreskes and Eric Conway, Merchants of Doubt looks at influencing on a much broader basis.

Banksy's art is, in many regards, a direct refutation of advertising. The letter attributed to him on the topic is originally by someone else however. It remains exceedingly good reading: http://thefoxisblack.com/2012/02/29/banksy-on-advertising/

There's a large literature on the subject.

Generally, psychological aspects of advertising: http://www.worldcat.org/search?q=su%3AAdvertising+Psychologi...

Criticisms of advertising: http://www.worldcat.org/search?qt=worldcat_org_all&q=critici...

At Wikipedia: https://en.m.wikipedia.org/wiki/Criticism_of_advertising

eh78ssxv2f · on March 9, 2017

There is an option somewhere in Google settings to turn off personalized ads.

Filligree · on March 9, 2017

Which makes the ads much poorer. You'd need a particular philosophy to want to check that box; most people, I think, would use a blocker.

patrickaljord · on March 9, 2017

Why isn't it in my interest to be shown ads that offer me goods and services of my interests?

clap · on March 9, 2017

Because ads are evil. They are the bedrock of all worldwide bullshit and scams. They make sellers not care about reputation, competitors and eventually product quality. They turn product competition into ad competition. They make all products more expensive because of this “advertising tax”. They turn off the customer’s head to make a zombie who buys the best-advertised product instead of the one with the best quality. They are good only for advertising companies and scammers.

patrickaljord · on March 9, 2017

Ads are a great way to make your product known. Have you ever run a business? At some point if you want to participate in the market of voluntary exchange of goods, services and ideas, you need to advertise. The only countries where ads are forbidden and nowhere to be found are North Korea and before that the USSR. Very depressing places. I love seeing people expressing themselves and showing off what they've built and trying to sell it. What's wrong with that? Advertising oneself is one of the most human traits we have and has always existed one way or the other.

clap · on March 13, 2017

Yes, I’m saying this from my business experience. I’m selling software. It was cracked, and the thief is easily selling it with AdWords now. He has the top ad position in Google (despite several DMCA notices, and not only from me), and Google doesn’t care about it as long as they get good money from the thief. Google must share responsibility with the scammers for false advertising, and the customers and sellers should have a way to make Google pay for promoting scams. The ad cost should be proportional to the possible damage.

tonyedgecombe · on March 9, 2017

I'm amazed that isn't obvious to you, they are there to influence your buying habits in the interest of the advertiser.

oblio · on March 9, 2017

Only (mostly?) if you don't know that they're trying to.

kavi87 · on March 14, 2017

That's dangerous thinking, you know it.

bitmapbrother · on March 9, 2017

Is someone forcing you to use their services? Stick to DDG if you don't like the contract.

icebraining · on March 9, 2017

I think the point is that they're payment enough, no need to feel hypocritical for disliking the captcha.

bitmapbrother · on March 9, 2017

[flagged]

Ntrails · on March 9, 2017

I do that not because I want to not be tracked, but because I am bad at computers and don't want to accidentally install a virus because blatting and starting again would be a bunch of effort and it's not like my stuff is backed up or anything.

Sometimes though I wonder if I shouldn't turn adblock off, for example, and then I'm at work and some website auto-plays video and I remember.

Filligree · on March 9, 2017

Imagine, if you will, a scene like this. It is dusk. You are out running in the park, trying to get exercise before the last light fades. Suddenly you realize: A group of people have been closing on you, boxing you in from all sides.

You try to run. Your muscles burn, you heave for breath. They match your pace exactly, coming closer... and closer... and closer. You cast your gaze desperately from side to side. Isn't anyone around? You're scared. You're not a strong man, and these people all look like body-builders and criminals. No, now that they've come closer, you're not sure they look like men at all. Is that a tattoo, or is it... is it a tusk?

Then you stumble over a rock, fall, and scrape your hands. While you're rolling around and trying to get back to your feet, the men stop in a tight circle around you. They look down at you.

"What," you begin. Your voice is quavering.

"Take backups," the men say, as one. "Take backups, for you never know what may happen."

Then they disperse into mist.

yjftsjthsd-h · on March 9, 2017

Having backups is important. Not needing to use them, however, is still a very good thing.

Ntrails · on March 10, 2017

GREAT NOW I AM HAVING NIGHTMARES

Filligree · on March 11, 2017

A job done, and well done.

quickben · on March 9, 2017

And people that don't, often have to buy/run antivirus, anti-malware and rootkit scanners/cleaners.

Say, do you also run without a firewall and keep mail and other ports open, just to prove you are less paranoid than the rest?

dredmorbius · on March 9, 2017

In many respects, yes.

Travel to/from the United States without some social media footprint is on the verge of becoming suspicious in and of itself.

Benjamin "Mako" Hill has long since observed "Google has most of my email because it has all of yours". https://mako.cc/copyrighteous/google-has-most-of-my-email-be...

Facebook creates shadow profiles of non-member identities.

It goes on.

stephenr · on March 9, 2017

Did I claim to be using their services? Stick to what you know if you don't like reading what I actually write.

I was questioning someone feeling "hypocritical" for using other google services but disliking their new recaptcha tool.

As for "the contract": I'd wager that for most people, most contracts they see/sign are easier to get a clear understanding of, than Google's business model of "free" services.

jhall1468 · on March 9, 2017

> As for "the contract": I'd wager that for most people, most contracts they see/sign are easier to get a clear understanding of, than Google's business model of "free" services

Oh bull shit. The contracts you sign (or click I agree to) are thousands of legal definitions and words that most people don't even read, much less understand.

Google's contract is very simple: We show you ads (the more we understand your interests, the more targeted our ads) and you get to use our stuff without monetary cost.

icebraining · on March 9, 2017

Have you tried asking your non-IT acquaintances whether they knew and understood the tracking part? Because my experience is that most people don't, even if you think it should be obvious.

stephenr · on March 9, 2017

> We show you ads (the more we understand your interests, the more targeted our ads) and you get to use our stuff without monetary cost.

And where do they spell that out?

Taylor_OD · on March 9, 2017

But you don't feel more than weird about using their search engine for free?

I think of the personal data they collect from me and these types of things as my payment for their "free" services.

jlebrech · on March 9, 2017

at least it's better than a CIACaptcha "please click on the terrorist on this photo"

dexterdog · on March 9, 2017

You're not working for Google. You're working for the site that you want to visit. That site doesn't want to pay to do its own verification.

bluthru · on March 9, 2017

I much prefer that to being tracked.

quentindemetz · on March 9, 2017

Google didn't invent this: Luis von Ahn, a CMU grad student did in 2000. Google acquired it in 2009.

https://www.cylab.cmu.edu/partners/success-stories/recaptcha...

dsp1234 · on March 9, 2017

The above poster doesn't say that Google invented recaptcha, only that their use of it to have humans clean street view data is very Googly.

The original recaptcha was only text from book digitization projects.

_0nac · on March 9, 2017

The original pre-Google reCAPTCHA was used for digitizing books. Using it to read house numbers for Street View was a Google addition.

bbarn · on March 9, 2017

The original inventor also went on to found Duolingo, which uses teaching someone a language as a way to translate web pages as a service. I like that model a lot more, as at least you get something in exchange (the language learning)

joshuamorton · on March 9, 2017

(am google)

To be fair, if you use google maps, and don't have any ethical issues with helping Google, you do get a return on investment, in that you get improved ML capabilities and improved address recognition in google maps.

mirimir · on March 9, 2017

I see lots of those on Tor browser. There's nothing about entering street numbers, just identifying images that include street numbers. But perhaps that's also valuable. And what about the other major types, such as rivers, mountains, store fronts and street signs?

obmelvin · on March 9, 2017

I believe those are to provide data for image segmentation (localization of an image is harder than pure recognition).

the8472 · on March 9, 2017

It depends on whether you're presented with the v1/noscript version or the v2 captcha. Desktop users may prefer the noscript version since it's faster to fill out with a keyboard if you're already typing into a form anyway instead of having to switch to the mouse.

gexcolo · on March 9, 2017

Both v1 and v2 have noscript versions. Here is what the v2 noscript version looks like:

https://vc.gg/B9zmj4hi https://vc.gg/CTskizZe

mirimir · on March 9, 2017

I'm talking about image-based v2. I have better luck with them if I allow Javascript. I don't recall seeing the code pasting ones until recently. I often found v1 impossible. And buggy, in the sense that they kept repeating even when I clearly got them right.

the8472 · on March 9, 2017

That is terrible. I really prefer the v1 variant since I generally don't get the single-click for v2. Apparently they have discontinued it.

akerro · on March 9, 2017

Not sure if that's good for us. We're doing work for google for free. Google Translate will replace translators, Google car will replace drivers, Google VR + map will replace city guides. We're giving them knowledge and powers to replace us, for free. Professions will disappear, people will lose jobs, Google will get richer and richer, will not pay me or you for fixing their suggestions in Google Translate or fixing maps or improving computed route from A to B.

No one is being paid for that work, but someone is monetizing it.

washadjeffmad · on March 9, 2017

I consider it more a donation towards posterity than getting stiffed on some short term compensation.

It's just an opinion, but if the things Google has created and shared into the whole of human knowledge towards the betterment of future societies are still being used in a century for my grandkids (and the rest of humankind) to benefit from, then I think it's better than not having it because a few million people who couldn't have done it themselves said 'no' over a pittance.

I can't speak for you, but my dying thoughts won't be a sour reflection on all of opportunities to monetize my existence I might have missed.

Pigo · on March 9, 2017

There is some trade off when you think of the benefits we get for free, in return, from Google. I barely recall what it was like to buy a map, writing down directions, and praying to God I didn't get thrown off on the way.

iainmerrick · on March 9, 2017

> the things Google has created and shared into the whole of human knowledge

Do they really share it, though? You can't download and reuse data from Google Maps or Street View the way you can from OpenStreetMap, say.

I'm not saying this is a fatal problem. Google Maps is a lot more popular than OSM so the free-with-ads closed source approach clearly has a lot going for it. But the data isn't public, they own it.

Taylor_OD · on March 9, 2017

My brother teaches math to at risk youth in the Denver area. He has students that don't speak a lot of English (their primary language is Spanish) and he's trying to explain math concepts to them. He realized he can use google translate and get 95% of the information across that he needs to.

I've never personally hired a professional translator and I don't think I would if I was going to travel to a country where the majority of people don't speak English because I assume a professional translator is expensive. However if I know I have free access to google translate which will be useful enough for me to navigate by myself I would be much more likely to go on such a trip.

I'm sure some jobs will be lost but at least for middle class people who aren't able to afford translators the technology will be used to communicate better without affecting professional translators.

There are plenty of similar professions that exist but aren't utilized by the middle/lower class due to cost. I would love to hire an interior decorator, as I'm sure plenty of home owners would, but I haven't due to the cost and likely never will. If Google (or any other company) offered a service where I put photos of my home online and gave me a free layout with online links to purchase the furniture I would be thrilled and no interior decorator would be out of a job because I wasn't going to pay for one anyway. I think it's all about the level of quality you want.

stoolpigeon · on March 9, 2017

Google translate is decent for some languages and really poor for others - so I'd run it by some native speakers before you rely on it for anything important.

User_424 · on March 9, 2017

Work in the Language Service Industry. Google translate for a few languages is decent at best for social conversations. It is never used in serious business translations. The technology still has a long way to go.

pizza234 · on March 9, 2017

This assumes the amount of work is fixed. Some people consider it a fallacy:

https://en.wikipedia.org/wiki/Lump_of_labour_fallacy

and some don't, so this point of view is arguable, to say the least.

Regardless, translation is intellectual work; driving (depending on the point of view) isn't. Same goes for city guides; people don't necessarily prefer an electronic device to a human.

schrodinger · on March 9, 2017

You could argue that your payment is access to all of the free services they offer (e.g. Gmail)

juice_bus · on March 9, 2017

Which is also mined for data for targeted advertisement, wouldn't that be the payment for access?

_zn02 · on March 9, 2017

Yes, but now we're really just arguing about the actual costs of those services.

akerro · on March 9, 2017

Will that feed me or make rich richer and poor poorer?

deadbunny · on March 9, 2017

Is it meant to?

taway_1212 · on March 9, 2017

As long as automatic translators/drivers etc. are better and/or cheaper than human ones, everyone (except people in said professions) will benefit.

throwaway2016a · on March 9, 2017

Goog-411 predates that quite a bit and is arguably more Googley. Launching a free 411 service (at a time when you usually paid per minute for that) and using it to improve your voice recognition algorithms.

It was essentially "OK Google" for dumb phones.

For anyone not familiar, 411 (in the US at least) was the directory service number. You could call it to get things like what time a restaurant was open until. Directions to the airport. Etc.

macrael · on March 9, 2017

They also gave out free voicemail for the same reason, no?

2_listerine_pls · on March 9, 2017

The idea existed since before. Google just adapted it to images other than text after they bought it.

dzhiurgis · on March 9, 2017

I wonder how long until they place some native ads in there so that publishers could earn a penny.

Waterluvian · on March 9, 2017

"Look at the picture to the right. How much is the new Cheddar Cheese Apocalypse Whopper for a limited time only?"

q3k · on March 9, 2017

Solvemedia does something [1] like this. The experience is horrible.

Doesn't help that I only see it on fairly shady websites, same kind that use adf.ly, pop-unders and other obnoxious monetization schemes.

[1] - http://solvemedia.com/publishers/

ge96 · on March 9, 2017

I wonder about the error-check thing. If at least one paid employee had to enter a first-set of data that was 'absolutely correct' before being seen by multiple people online which then 'solidifies' the correct responses. I briefly read about this somewhere but still not sure. I mean it does obviously work in the sense of entering something blatantly not correct is flagged.

ceejayoz · on March 9, 2017

Probably no need to have someone seed it. You just show the same image to a thousand people, and if 900 of them answer it one way that's almost certainly the correct answer.
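The consensus idea above can be sketched in a few lines. This is only an illustration of majority voting over crowd answers, not reCAPTCHA's actual pipeline; the function name `consensus_label` and the `min_votes` and `min_agreement` thresholds are assumptions made up for the example.

```python
from collections import Counter


def consensus_label(answers, min_votes=5, min_agreement=0.9):
    """Return the answer most people agree on, or None if there is no consensus.

    min_votes and min_agreement are illustrative thresholds, not known values.
    """
    # Normalize so that " 1428" and "1428" count as the same answer.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < min_votes:
        return None  # not enough people have seen this image yet
    label, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_agreement:
        return label
    return None  # the crowd disagrees too much to trust any answer


# 900 of 1000 people read the house number the same way.
answers = ["1428"] * 900 + ["1423"] * 60 + ["1426"] * 40
print(consensus_label(answers))
```

A scheme like this never needs a seeded "correct" answer, but it also cannot notice when most people make the same mistake.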

ge96 · on March 9, 2017

Interesting. I'm inclined to think that this is not good. Consensus makes sense, but it just seems to me that everyone could be wrong. Then again, the quality-check (seed guy) could be wrong as well.

ybrah · on March 9, 2017

I make sure to always spell it wrong on the characters I can tell the botnet couldn't figure out

It's a small rebellious act of mine (that is moot because I'm sure they give the same captcha to other users for verification)

tzm · on March 9, 2017

How can you translate the entire internet? https://www.youtube.com/watch?v=cQl6jUjFjp4

danielrhodes · on March 9, 2017

https://en.m.wikipedia.org/wiki/Human-in-the-loop

Expect to see more of this.

gingerbread-man · on March 9, 2017

Clicking on the link I thought Wikipedia just unveiled a radical re-design haha. (It's a link to the mobile website.)

wisebit · on March 9, 2017

"Powering these advances is machine learning and a combination of threat yadda yadda"

Looks like they do little more than just check for a Google cookie [1].

[1]. https://www.blackhat.com/docs/asia-16/materials/asia-16-Siva...

edit: still, it's far better than the previous state of captchas. I'm glad they did this. But it's like for anything to be considered "advanced" or "good" in tech lately, it has to have been powered by "machine learning".

r721 · on March 9, 2017

Yeah, I have third-party cookies blocked and have never seen "i'm not a robot" captchas, it's "street signs"/"mountains" ones all the way.

xg15 · on March 9, 2017

Well, technically, it is machine learning. Only that the machine learning was likely part of the usual data mining on google accounts and not much specific to the captcha problem...

(That said, whenever I used that checkbox widget they had before this announcement, there was a noticeable framerate drop in the browser while the thing was doing its magic. So I suspect, they are at least doing some browser fingerprinting/benchmarking to see if the widget runs inside selenium or a stock browser.

I also remember rumors that they analyze keyboard/mouse input on the page and check if it looks "human", but I'm not sure if that's true.)

mpeg · on March 9, 2017

Yeah it's basically browser fingerprinting (incl. GPU fingerprinting, hence the slowdown) plus google cookie.

If your browser is standard (AKA no anti-fingerprinting plugins) and your advertising cookies are not blocked (privacy or adblocker plugins) you'll probably pass with no issues.

If either of those is not true, you have to solve a bunch of image captchas.

Mouse/keyboard input analysis was just marketing talk; at least when they first released the nocaptcha it wasn't even captured.

vidyesh · on March 9, 2017

This is so confusing. It doesn't explain how it works nor a demo page nor the reason behind why it went invisible.

cool_shit · on March 9, 2017

It doesn't seem to be fundamentally different than reCAPTCHA. They will probably replace the click-this-checkbox box with a set of elements already on the hosted page. Only indexing a page's HTML isn't good enough -- it's better to also know how people interact with the HTML. Unfortunately, as soon as they click a link on Google, Google is no longer invited to the party, and is blind to how the user interacts with the page.

The next best thing a search company can do is have every website willingly track their users' mouse and key movements, and then willingly send all of that data to the company's inbox. In return, Google provides them with a binary classifier trained on all of the user click-stream/click-move data which determines whether or not the user is a bot!

It's an OK deal for the website owner; it's a great deal for Google. Not to mention, the user is now sending anonymous data to Google, at the expense of the website's Privacy Policy.

Google gets website owners to willingly install live-cameras on every corner of their website, and then willingly send over all of the footage, in exchange for "protection" from bots. Cough The Government gets citizens to willingly fill out lengthy tax forms, and then willingly send over a bunch of money, in exchange for "protection" from criminals. Cough

Quarrelsome · on March 9, 2017

why are you implying that taxation is somehow a protection racket? That's a pretty radical notion wrapped in a matter-of-fact language. Strikes me as a bit dishonest. Taxation pays for plenty of things besides police and military.

halflings · on March 9, 2017

Agreed. The documentation page [1] has much more information.

[1] https://developers.google.com/recaptcha/docs/invisible

endless1234 · on March 9, 2017

Instead of the user having to click the "I am not a robot" checkbox, then the submit button (or w/e), this basically binds clicking the submit button to reCAPTCHA. So whatever checks clicking "I am not a robot" would run are instead done automatically when submitting.

No idea how the additional prompts (e.g. "select the parts of the image with a street sign") are shown nicely in this "invisible" UX though.

iaml · on March 9, 2017

They are shown in a lightbox in the middle of a page.

illnewsthat · on March 9, 2017

Here is the demo: https://www.google.com/recaptcha/api2/demo?invisible=true

iaml · on March 9, 2017

To get the actual verification process, open it in incognito mode.

markdown · on March 9, 2017

> This is so confusing.

Really?

> It doesn't explain how it works nor a demo page

Imagine a web form without reCaptcha. Do you really need a demo of that?

> nor the reason behind why it went invisible.

Because Recaptcha had an annoying, bad, terrible UX.

mulmen · on March 9, 2017

You have done nothing to answer the GP.

How does Google determine if the captcha should be shown?

What are the "adaptive captchas" that are shown to suspect users? A demo would do a great job here.

How does an invisible captcha "create value by applying human bandwidth" if the premise is that humans never see the captcha?

esond · on March 9, 2017

I also find it confusing. No need to be rude.

malikNF · on March 9, 2017

While I like the idea of not having to deal with these annoying ReCAPTCHA prompts, something somehow feels "intrusive ?". I mean does this mean google is going to keep track of what I would be doing when I visit a site?

Say for instance I am signing up for a website, does the password I enter get sent to google servers to be analyzed now?

_m7bj · on March 9, 2017

>I mean does this mean google is going to keep track of what I would be doing when I visit a site?

Oh buddy, I have bad news for you...

https://support.google.com/dfp_premium/answer/1716364?hl=en

https://www.google.com/analytics/#?modal_active=none

https://www.doubleclickbygoogle.com/solutions/measurement/

popol12 · on March 9, 2017

Wow, I never heard of Pixel Tracking. That sounds super evil. Are there browser plugins to fight back on this one?

_m7bj · on March 9, 2017

My standard internet security suite is ublock origin, noscript, self-destructing cookies, privacy badger and httpseverywhere.

Privacybadger does an ok job against tracking pixels. It uses machine learning to try to detect when a specific url or domain seems to be "following" you around the internet, which could indicate a tracking pixel among other things. It then blocks these entities and gives you the option to override.

And yes, I do think tracking pixels are super evil. It could be standard for websites to have a bar at the bottom with little logos[1] showing the companies that are tracking you, with the logos being the remote-loaded content. The fact that these companies feel compelled to make it 1x1 transparent pixels tells me very clearly that they know they are doing something people don't want them to do, because they've gone out of their way to hide it. It's a clear misuse of browser capabilities, yet they do it anyway. What a pack of cunts they are.

[1] or textual short names company ticker style if you want to minimize bandwidth
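The "following you around" detection described above can be approximated with a simple counter. This is a rough sketch under assumptions, not Privacy Badger's real code: the class name `TrackerHeuristic` and the threshold of three sites are illustrative, and the sketch uses plain counting rather than any learned model.

```python
from collections import defaultdict


class TrackerHeuristic:
    """Flag third-party domains that show up on many unrelated sites."""

    def __init__(self, threshold=3):
        # threshold is an assumed value chosen for the example
        self.threshold = threshold
        self.seen_on = defaultdict(set)  # third party -> first-party sites

    def observe(self, first_party, third_party):
        """Record that a page on first_party loaded something from third_party."""
        if first_party != third_party:
            self.seen_on[third_party].add(first_party)

    def is_blocked(self, third_party):
        """True once the domain has been seen on enough distinct sites."""
        return len(self.seen_on.get(third_party, ())) >= self.threshold


heuristic = TrackerHeuristic()
for site in ["news.example", "shop.example", "blog.example"]:
    heuristic.observe(site, "pixel.tracker.example")
heuristic.observe("news.example", "cdn.news.example")

print(heuristic.is_blocked("pixel.tracker.example"))
print(heuristic.is_blocked("cdn.news.example"))
```

The override option mentioned in the comment would sit on top of this: a user-maintained allow list consulted before `is_blocked`.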

r3bl · on March 9, 2017

They're an even bigger thing in emails. Usually, companies include a pixel from an external resource and then figure out whether you've read the email by checking whether that image was loaded.

IIRC, GitHub does that. If you read the email of some notification, it won't show you that notification in your notifications. The solution is to block external image loading in your email client (I know that Thunderbird has that, and I know that Zoho's email client on Android has that).
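For anyone wondering how a pixel can reveal that an email was read: the sender gives every recipient a unique image URL and simply records which URLs get fetched. A minimal sketch follows, with no real web server involved; `OpenTracker`, the `t` query parameter and the `mail.example.com` URL are all hypothetical names chosen for the example.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


class OpenTracker:
    """Toy model of the sender's side of an email tracking pixel."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.recipient_by_token = {}
        self.opened = set()

    def pixel_url(self, recipient):
        """Create a unique image URL to embed in this recipient's email."""
        token = secrets.token_urlsafe(16)
        self.recipient_by_token[token] = recipient
        return f"{self.base_url}?{urlencode({'t': token})}"

    def handle_request(self, url):
        """Simulate the server receiving a request for the pixel image."""
        token = parse_qs(urlparse(url).query).get("t", [None])[0]
        recipient = self.recipient_by_token.get(token)
        if recipient is not None:
            self.opened.add(recipient)  # the mail client loaded the image
        return recipient


tracker = OpenTracker("https://mail.example.com/pixel.gif")
url = tracker.pixel_url("alice@example.com")
tracker.handle_request(url)  # happens when the mail client loads images
print(tracker.opened)
```

Blocking external images in the mail client works because `handle_request` is then never triggered.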

username223 · on March 9, 2017

Yep, tracking pixels are among the oldest forms of internet surveillance, predating the far more aggressive and intrusive JavaScript companies use today. They're one of the reasons why most mail clients don't load images by default. They only give information on page loads, not obsessive behavior tracking, but they're harder to block.

They're why I block doubleclick.net, google-analytics.com, etc. in my hosts file rather than just blocking their JavaScript.
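A small sketch of what such hosts-file blocking looks like. The helper `hosts_entries` is hypothetical, and pointing the domains at `0.0.0.0` is one common convention rather than the only option. Hosts files do not support wildcards, so each subdomain needs its own line; the sketch only adds the `www.` variant.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Build hosts-file lines that send each domain to a non-routable address."""
    lines = []
    for domain in domains:
        lines.append(f"{sink} {domain}")
        # No wildcard support in hosts files, so list subdomains explicitly.
        if not domain.startswith("www."):
            lines.append(f"{sink} www.{domain}")
    return "\n".join(lines)


print(hosts_entries(["doubleclick.net", "google-analytics.com"]))
```

The printed lines would be appended to `/etc/hosts` (or the equivalent file on other systems).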

littlehood · on March 9, 2017

Gmail does something to divert it - it downloads all the resources once and CDNs them for you, so the email author can't see who and where opened the email.

dvfjsdhgfv · on March 9, 2017

Adblockers such as uBlock already block many of these since they're hosted by ad agencies. If someone is determined they can bypass the filter easily though; you'd need a static filter for each of these then. The only global solution would be to block all remote images which would likely break many websites. Also, nowadays so many sophisticated tracking methods exist I'm not sure it's worth the hassle.

dandelion_lover · on March 9, 2017

I use RequestPolicy. It prevents a website from calling another website unless I allow it.

Grangar · on March 9, 2017

Check out Ghostery.

dvfjsdhgfv · on March 9, 2017

It's just one more of the many elements of the Google puzzle. They control most web searches, GA is a de facto standard of web analytics, Gmail is the most popular mail service, not to mention other free services, Android etc. etc., Google can easily track you all the way from the moment you open your browser (likely, Chrome...), through your searches, visits and actions. It's high time people realized giving so much power to one company, just for short-time convenience, is extremely dangerous. Google might have benevolent management now, but this can easily change, and it's scary to think what could happen then.

tdb7893 · on March 9, 2017

The saving grace here is that there are alternatives for all those services and I can switch pretty easily at any time so honestly it's less urgent to me

sfifs · on March 9, 2017

don't they already? Almost everyone uses Google analytics and most websites that have ad placements use the Google network. Where do you think the data for those comes from?

return0 · on March 9, 2017

I expect to see more people putting recaptchas in their login screens, which means even more data funneled to google (after NSA i guess).

awqrre · on March 9, 2017

if javascript is enabled, it should be trivial to detect a human... (but of course they can log everything)

Viper007Bond · on March 9, 2017

You don't give bot authors enough credit. They can run JavaScript too and try to replicate human behavior.

daviding · on March 9, 2017

It's done by binding it to one of your own buttons, where I guess it does the 'jitter test' on that element (if it's a low risk IP).

https://developers.google.com/recaptcha/docs/invisible

retrogradeorbit · on March 9, 2017

What is "the jitter test"?

greenhatman · on March 9, 2017

That probably refers to checking whether the mouse moves like a human or more bot-like.

Recaptcha does more than that though. It checks if you are logged into any other Google services, whether your browser user agent matches your actual browser, and I think one or two more things.

mpeg · on March 9, 2017

It's more urban legend than fact, recaptcha checks your google cookies and little else. They initially marketed it as looking for "human signs" such as mouse movements but they didn't, and AFAIK they still don't.

gregschlom · on March 9, 2017

I would guess testing that the mouse pointer moves semi-randomly over the button, in a fashion typical of the way humans browse a web page
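Purely as an illustration of what a "jitter test" might look like (other comments in this thread dispute that reCAPTCHA does anything of the sort), one could measure how straight the pointer's path to the button is. The function names and the `0.999` threshold below are assumptions invented for the sketch.

```python
import math


def path_straightness(points):
    """Straight-line distance divided by distance actually travelled.

    A value close to 1.0 means the pointer moved in a perfectly straight line.
    """
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def looks_scripted(points, threshold=0.999):
    """Assumed rule of thumb: an almost perfectly straight path is suspicious."""
    return path_straightness(points) >= threshold


scripted_path = [(0, 0), (50, 50), (100, 100)]
wobbly_path = [(0, 0), (30, 12), (55, 40), (80, 95), (100, 100)]
print(looks_scripted(scripted_path), looks_scripted(wobbly_path))
```

A check this naive is trivial to defeat by adding noise to the scripted path, and it says nothing about keyboard or touchscreen users.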

reaktivo · on March 9, 2017

I can't imagine it being limited to jitter, I'm curious how they handle touchscreens.

Klathmon · on March 9, 2017

Seeing how you scroll the pages, how you click your finger, the speed that you tap elements.

Honestly I feel mobile is easier to validate

manojlds · on March 9, 2017

What about the Tab and Enter users?

p1mrx · on March 9, 2017

Those users are basically robots already.

danso · on March 9, 2017

If you're asking for a demo of it in action:

https://www.google.com/recaptcha/api2/demo

garganzol · on March 9, 2017

ASP.NET AJAX Control Toolkit provided similar functionality since 2007. The corresponding component is called NoBot. http://www.ajaxcontroltoolkit.com/NoBot/NoBot.aspx

"NoBot is a control that attempts to provide CAPTCHA-like bot/spam prevention without requiring any user interaction. This approach is easier to bypass than an implementation that requires actual human intervention, but NoBot has the benefit of being completely invisible."

Works like a charm even now, 10 years later.

greenhouse_gas · on March 9, 2017

Just curious. What happened to all the data made by the old reCaptcha (the one that OCRed books)?

I know that they stopped the project, but did Google at least release the data (old public domain text of books)?

dingdongding · on March 9, 2017

I think that data is what made a lot of old books searchable on Google Books.

fiddlerwoaroof · on March 9, 2017

Wasn't recaptcha originally associated with Project Gutenberg's digitization efforts?

fiddlerwoaroof · on March 9, 2017

I guess not. https://en.wikipedia.org/wiki/ReCAPTCHA#Origin

greenhouse_gas · on March 9, 2017

It says there that it was.

Seems like a bait-and-switch. Do free labor for a good cause (PD books), turns out you're just growing Google's library which can be taken down at a whim.

icebraining · on March 9, 2017

> It says there that it was.

No, that was the Distributed Proofreaders project, which is unrelated (just used as an early example of crowdsourced OCR).

reCaptcha originally helped digitize the archive of the New York Times, but that was finished years ago.

colept · on March 9, 2017

What I want to know is what happens if you get 'trapped' in this invisible ReCAPTCHA?

The most frequent encounters with CAPTCHAs I see are rejected API requests over VPN.

chrisacky · on March 9, 2017

I was browsing Upwork the other week and I got trapped.

I couldn't load any more jobs, or view any more workers. I had to inspect the Network tab in Chrome, open the API request and then click a "I am not a bot" on their API page.

That was a poor implementation tbh.

distances · on March 9, 2017

I also expect I will start getting silent registration failures with this. I'm using uMatrix to block most third-party scripts, and it's usually quite clear when I have to allow extra things with traditional captchas.

Then again, it's allow-more-and-retry on many pages already so nothing new in that regard.

SN76477 · on March 10, 2017

I get them too often on mobile.

Geee · on March 9, 2017

Doesn't anyone else see it as a problem that Google's bots (and possibly CIA/NSA) can access any system using Google ReCAPTCHA? They can create unlimited amount of accounts on social networks, blogs, forums etc. and thus have unlimited social voting power on the Internet.

There is a need for a decentralized Captcha that can't be circumvented by anyone.

hmate9 · on March 9, 2017

Google owns google. If they really wanted to manipulate people they would manipulate their own search result and not go through the trouble of spamming other sites.

stavros · on March 9, 2017

Would not having ReCAPTCHA prevent this?

iainmerrick · on March 9, 2017

Using some other bot-filtering system would prevent it. Of course the makers of that filter would be able to bypass it, but I guess you could use multiple filters...!

stavros · on March 9, 2017

Can't you do that now, then?

visarga · on March 9, 2017

That whole page wasn't able to explain WHY it is invisible and how it works if it's invisible.

tyingq · on March 9, 2017

Guessing it's your browsing history (courtesy of including js from a Google domain) plus mouse tracking.

It doesn't seem much different than the current "click here" one to me. They are just letting the page owner substitute their own button in lieu of the check box.

stavros · on March 9, 2017

Is that why I always have to solve a challenge? Because I use Privacy Badger and uBlock Origin?

mpeg · on March 9, 2017

Yes, recaptcha works based on advertising cookies.

vidyesh · on March 9, 2017

What about non-logged users/sessions?

tyingq · on March 9, 2017

I assume it pops up a challenge, again, just like the current one.

Edit: Yep. "Human users will be let through without seeing the "I'm not a robot" checkbox, while suspicious ones and bots still have to solve the challenges."

It really is just the current one, with the site owner's button instead of Google's checkbox.

Grue3 · on March 9, 2017

I remember when they advertised it with "you just need to click a checkbox". In reality, I always have to solve a streetview captcha, possibly several times in a row.

mpeg · on March 9, 2017

Let me guess, you use privacy enhancing browser extensions or adblockers? :)

neurostimulant · on March 9, 2017

Are you behind a NAT/VPN (i.e. your public ip is shared among many other users), or not logged in to google?

zumu · on March 9, 2017

I always wondered how blind people entered captchas. Does this finally make captchas accessible?

ry_ry · on March 9, 2017

There is usually an audio version available, presumably screen readers are pre-configured to present the audio option on popular captchas by default.

I'm curious now, will have to give it a whirl when I get into the office :D

Suspect Google will rely on their vast knowledge of people's browsing habits based off IP/account/ad-tracking/browser-fingerprinting to skip the user input aspects. Although that said, a screen reader won't have the standard physical interaction clues client-side that a user is a real person, mouse tracking for ex. is probably a moot point. Not really sure how Google will handle those, or if blind users will get a degraded always-on captcha experience.

Either that or a headless screen reader becomes the scraping/botting tool of choice.

provemewrong · on March 9, 2017

Speaking of. Audio captchas are one of the most spooky things I've heard. Noise in the background and muddled voice listing numbers. Think numbers stations, except they try their best to make the numbers unintelligible (just like regular captchas tend to do with written words).