Originally published at: http://boingboing.net/2016/09/15/machine-learning-system-can-de.html
…
“Enhance 224 to 176…”
I remember watching a show on adaptive optics systems used for astronomy. The scientist who created the software said he could easily use it to descramble pixelated (blurred) images.
Always sounds weird to me how folks are expected to talk to machines in full, syntactically correct sentences. If I were Deckard, I’d just say “224 to 176”. What else is the damn thing gonna do - blur it all up?
Be terse. Be imperative. We have to let them know who’s boss.
I remember in the '87 Costner film No Way Out, a technology like this was portrayed.
I’ve read the paper and it does a remarkably good job of obfuscating what their results mean. Like most machine learning researchers, it looks like they cheat. When they report accuracy ratings for their results, they appear to mean the following: given a set of original images and a set of obfuscated images, the program can correctly guess which obfuscated image came from which original image (e.g. 80% of the time). That’s much easier than recognizing faces. Their technique only works when you have a (small) set of possible source images to choose from, one of which is guaranteed to be the correct unscrambled image. I’m struggling to think of any real world case where you need to de-obfuscate an image when you already have the unscrambled image before you start…
No, that’s not right. They say (for the AT&T face data):
Each individual has 10 images in the dataset, taken under a variety of lighting conditions, with different expressions and facial details (i.e., with or without glasses). For our training set, we randomly selected 8 images of each person. The remaining 2 were used in the test set.
This means that they scrambled the 2 images (per person) that the program hadn’t seen, then asked it to identify which person they came from, based on its previous analysis of 8 images per person.
As far as I can tell from reading the paper, the program never saw any unscrambled images.
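For what it’s worth, here’s roughly what that protocol looks like in code. This is only my sketch of the description quoted above - the `pixelate` function, the dict layout, and the split helper are placeholders, not anything from the paper - but it shows the point: the classifier is trained and tested entirely on obfuscated images, so there’s no unscrambled “answer key” in the loop.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_and_obfuscate(images_by_person, pixelate, train_per_person=8):
    """images_by_person: {person_id: list of 10 face images (arrays)}.
    pixelate: the obfuscation applied to every image the model ever sees.
    Returns (train, test) lists of (obfuscated_image, person_id) pairs."""
    train, test = [], []
    for person, imgs in images_by_person.items():
        order = rng.permutation(len(imgs))
        for i in order[:train_per_person]:      # 8 images per person -> training set
            train.append((pixelate(imgs[i]), person))
        for i in order[train_per_person:]:      # remaining 2 -> test set
            test.append((pixelate(imgs[i]), person))
    return train, test
```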
Still a long way from ‘descrambling pixelated/blurred redactions’, though.
And this is why I ask people to not tag me in photos…
Ah, that makes a little more sense. So it could be used realistically in a TV show when they’ve rounded up a set of suspects, taken multiple pictures of each of them, and now need to find out which of them is in the pixellated photo from the crime scene… “enhance the training set!”
Once again, Doctorow (unsurprisingly) clickbaits and simplifies a headline with an exaggeration of the facts regarding the capabilities tested/found in the study.
If you provided the researchers with just the images in the “16x16” mosaic column, they are not claiming the ability to depixelate them (83% of the time) back to the images in the “Original” column, as the headline implies. HHHHHHhhhhhhhhh…
Oh, and if the actual culprit isn’t among the suspects you rounded up, you’ll never know, as the algorithm will just point you to the best match. Not much of an improvement over current procedure then…
When given different images of the same people, the algorithm could determine their identity with 57% accuracy, or 85% when given five chances.
Wait a damn moment. How did we get 85% from five consecutive attempts? I assume the algorithm’s top match can either be right or wrong (there is no in-between). If the final result is the rounded average of the five attempts, the probability of being right should be 63%, not 85%.
The probability of making 3 right guesses followed by 2 wrong guesses is P(TTTFF) = .57^3 * (1-.57)^2 = 3.42%. Since order doesn’t matter, we use nCr to count all the different orderings of the letters: (5 C 3) = 5!/(3!2!) = 10. So the probability of making exactly 3 right attempts is 10 * 3.42% = 34.2%. Similarly, P(TTTTF) = .57^4 * (1-.57) = 4.54%, and there are obviously 5 ways to do this, so the total probability is 22.69%. The probability of getting all 5 guesses right is just .57^5 = 6.02%, and there is only one way to do this. Since the rounded average will be true if any of these ways happens, we add the probabilities together to get 63%. Where did the 85% number come from?
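Here’s that arithmetic as a quick Python check - just the binomial sum for a majority of five independent guesses at p = 0.57 (the independence assumption is mine, mirroring the reasoning above):

```python
from math import comb

p = 0.57  # per-guess accuracy quoted above
# Chance that a majority (3, 4, or 5) of five independent guesses are right,
# i.e. the "rounded average" of the five attempts comes out correct.
majority = sum(comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(3, 6))
print(f"{majority:.1%}")  # prints 62.9%, i.e. the ~63% figure, nowhere near 85%
```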
Just wait a few million years, find a computer with an IQ of 8,000, then tell it what to do.
You’d think that people would figure out how to properly redact things. Alas, we have stories like this: https://www.schneier.com/blog/archives/2005/05/pdf_radacting_f.html
Then there’s the story of Swirl Face…
I think it’s cheating to use J.K. Simmons’ face to test the system. It’s just so distinct and recognizable, even when heavily pixelated!
Also, once you have finished training your neural network, it will always give the same result on the same input. Maybe “chances” means sets of input data? I’m not sure.
This principle is why doing an absurd gravelly voice won’t keep your identity secret either, Bruce Wayne.
Didn’t that go out the window with being one of the few people with the resources to be Batman, having the same height and build as Batman, and not having an alibi whenever Batman shows up?
Totally different:
Also, Clark Kent’s image is flipped. Clark Kent parts on his right. Superman parts on his left. There’s a whole This American Life segment about it.
Sooo, it’s not really descrambling?
Sooo, the old “enhance” trick seen in Hollywood movies is still bunk?
This is lame. Give a human the pixelated images and they can probably do a better job of telling you which is the original specimen.