Google will defeat its own captchas for you

doctorow · January 5, 2019, 3:08am

Originally published at: https://boingboing.net/2019/01/04/own-goals.html

…

reactionabe · January 5, 2019, 3:14am

Me: You realize the entire point of captchas is to render themselves obsolete by teaching machines to answer questions that currently only humans can?

People: What? That doesn’t make sense.

Me: [Links to this post]

RickMycroft · January 5, 2019, 3:15am

Don’t stop there!

Train Google’s TensorFlow to recognize bits of traffic signs, automobiles and storefronts.

Immutable_Mike · January 5, 2019, 3:19am

Oblig:

Glaurung · January 5, 2019, 5:29am

Soon Google will plug this hole by inserting some signature noise into its audio prompts that its speech-to-text engine will recognize and refuse to process. I’m frankly quite astonished they didn’t take care of this loophole already.

But that just means you have to use someone else’s speech-to-text processor for your comment spam bot.

phiis161803 · January 5, 2019, 6:56am

Boundegar · January 5, 2019, 7:13am

Why stop there? Translate them into Urdu, and then back again!

Jorpho · January 5, 2019, 7:13am

I haven’t seen “play this captcha as audio” in a long time. It’s always “click here” and “identify the images” and so on lately.

Remember that all-too-brief interval when the random words spawned by captchas formed the basis of humor? Click for album:

[Album] captcha comics

GilbertWham · January 5, 2019, 8:08am

At 91%, its success rate is about twice mine at solving the damn things.

Glaurung · January 5, 2019, 8:29am

I still see “play an audio version” or some such every so often. Depends on the company producing the captcha, I think, but some of them definitely provide an audio option for the visually impaired.

jpc2769 · January 5, 2019, 8:43am

What I don’t understand about these is that if I try to just randomly click through, or intentionally get it wrong, it knows and won’t let me through until I do it right. What is the AI learning if it already knows the correct answer?

LDoBe · January 5, 2019, 8:48am

That’s why there’s two words.

One is known, the other is unknown. If you get the known one right, then it assumes you got the unknown one right as well.

It also doesn’t just rate its confidence in the whole unknown word. If you plug something in that doesn’t match any of the letters it’s guessed for the unknown word, then it assumes you’re poisoning the dataset.
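For anyone who wants to see the two-word idea spelled out, here is a rough sketch. It is not Google's code. The function names, the position-by-position letter comparison, and the choice to let the user through while discarding a suspicious transcription are all assumptions made for illustration.

```python
def shares_a_letter(ocr_guess: str, answer: str) -> bool:
    """True if the answer agrees with the OCR guess at any position.

    Assumption: letters are compared position by position; the real
    system's matching rule is not described in this thread.
    """
    return any(a == b for a, b in zip(ocr_guess.lower(), answer.lower()))


def grade(control_truth: str, control_answer: str,
          ocr_guess: str, unknown_answer: str):
    """Return (passed, vote).

    passed: whether the user typed the known (control) word correctly.
    vote:   the user's transcription of the unknown word, or None if it
            is discarded.
    """
    passed = control_answer.strip().lower() == control_truth.lower()
    if not passed:
        # Wrong on the known word: nothing the user typed is trusted.
        return False, None

    unknown = unknown_answer.strip()
    if not shares_a_letter(ocr_guess, unknown):
        # Matches none of the guessed letters: treat as dataset poisoning.
        # Assumption: the user still passes, but the vote is thrown away.
        return True, None

    return True, unknown


print(grade("harbor", "Harbor", "m0rning", "morning"))
print(grade("harbor", "Harbor", "m0rning", "xxxxxxx"))
print(grade("harbor", "harbour", "m0rning", "morning"))
```

The first call passes and records "morning" as a vote, the second passes but drops the vote, and the third fails outright because the known word is wrong.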

jpc2769 · January 5, 2019, 8:53am

I was talking about the traffic ones, like when it says “click all bicycles.” If I just randomly click some squares or intentionally leave out a bicycle, traffic light, etc., it knows I did it wrong and won’t let me through until I do it right. What is it learning if it already knows where all the bicycles in the picture are?

LDoBe · January 5, 2019, 9:00am

Oh, that one I couldn’t say.

udqbpn · January 5, 2019, 11:54am

My thought is that some of those images have already been checked by other humans, so the machine is confident because of those other humans. It’s similar to the text thing: some images it already knows for this or that reason, and if you get the known ones wrong, it discards your answers on the unknown ones too.

bobtato · January 5, 2019, 12:00pm

Modern CAPTCHAs often don’t require you to click anything, or will let you just check a box. I gather they are monitoring stuff like keyboard and mouse events on the page and can identify bots based on the timing of scrolls and clicks and whatnot. Those patterns are specific to the page, and hard to fake without empirical data from real users, which the page owner has but bots don’t.

When they do ask you questions, it may be mostly or entirely for their own selfish purposes. They show you some pictures they’re confident about (positive or negative), mixed in with some where they’re less confident. If you disagree with them on a confident match then you’re a witch. If not, they will take your opinion on the non-confident matches, and combine it with a bunch of other people’s opinions to get some useful training data. Potentially, since they’re tracking the shit out of you, they can eventually weight your answers based on how good you are at identifying things.

Basically, they’re not asking you “what is this image?”, they’re asking you to vote on whether the decision they’ve already made is correct.
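A minimal sketch of that "vote on a decision already made" scheme, under stated assumptions: the tile ids, the `None` marker for low-confidence tiles, and the simple majority rule are all hypothetical and chosen for illustration, not taken from any real CAPTCHA service.

```python
from collections import Counter


def check_and_collect(tiles: dict, clicks: set):
    """Grade one user's clicks against the tiles the system is sure about.

    tiles:  tile id -> True / False for a confident label,
            or None where the system is unsure (hypothetical encoding).
    clicks: tile ids the user selected.
    Returns (passed, votes); votes maps each unsure tile to the user's answer.
    """
    for tile_id, label in tiles.items():
        if label is not None and (tile_id in clicks) != label:
            # Disagreed with a confident label: fail, keep none of the answers.
            return False, {}
    votes = {t: (t in clicks) for t, label in tiles.items() if label is None}
    return True, votes


def majority(all_votes: list, tile_id: str):
    """Combine many users' votes on one unsure tile by simple majority."""
    counts = Counter(v[tile_id] for v in all_votes if tile_id in v)
    return counts.most_common(1)[0][0] if counts else None


tiles = {"a": True, "b": False, "c": None}   # "c" is the unsure tile
print(check_and_collect(tiles, {"a", "c"}))  # agrees on a and b
print(check_and_collect(tiles, {"b"}))       # contradicts confident labels
print(majority([{"c": True}, {"c": True}, {"c": False}], "c"))
```

This also shows why random clicking fails even though the system is still learning: the confident tiles act as the check, and only the unsure tiles produce training data.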

anothernewbbaccount · January 5, 2019, 12:36pm

This variety drives me mad. They are shit. Living in UK I am supposed to know what a ‘crosswalk’ is? That ‘click all bicycles’ actually means ‘click any box in which even a single pixel of something that might be a bicycle appears’?

JoshNewswell · January 5, 2019, 12:40pm

Obviously, you guys never used Tor. When you try to search anything in Google while using Tor Browser, not only will you be hit with a CAPTCHA almost every time, Google will also sometimes refuse to show you the CAPTCHA, or will show it but refuse to let you use the audio CAPTCHA. And even if it lets you do the test, it will claim multiple times that your performance is inadequate and force you to repeat the test. Then, when you pass, it will sometimes hit you with a CAPTCHA again. Why? Hell if I know.

This got so bad that I almost never use Google with Tor these days, making do with DuckDuckGo.

jpc2769 · January 5, 2019, 1:39pm

Thanks!!!

Elmer · January 5, 2019, 3:09pm

Isn’t it obvious? It’s learning about US.
