There aren’t that many gorillas left in the world. There are probably many more racist posts than actual pictures of gorillas. Now that word is out that you can get Google to misdiagnose images, there are probably a lot more deliberately mislabeled images as well.
Google’s art is in getting the software to train itself. In their ocean of data there is no clear parameter you can change to identify people as gorillas. The best you can do is to pick up the cases identified as ‘gorilla’ and try to double-check them somehow. Meanwhile, your image database is getting contaminated, possibly by racists, but also by images from the original news story. And you are still making mistakes and getting complaints. Or you pull the ‘gorilla’ category and it stops.
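For what it’s worth, “pull the ‘gorilla’ category” is about the simplest fix you can bolt on after the fact: a blocklist applied to the classifier’s output, with no retraining at all. A rough sketch of the idea (the label names and score threshold here are purely illustrative, not anything from Google’s actual system):

```python
# Hypothetical post-hoc filter: drop blocklisted labels from the model's
# predictions instead of retraining anything. Label names and the score
# threshold are made up for illustration.
BLOCKED_LABELS = {"gorilla", "chimpanzee", "monkey"}

def filter_predictions(predictions, min_score=0.5):
    """predictions: a list of (label, score) pairs from some image classifier."""
    return [
        (label, score)
        for label, score in predictions
        if score >= min_score and label.lower() not in BLOCKED_LABELS
    ]

# The offending label simply never reaches the user.
print(filter_predictions([("gorilla", 0.92), ("person", 0.88)]))
# -> [('person', 0.88)]
```

Crude, but it stops the complaints immediately, which is presumably the point.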
If they can count a particular sort of mislabelled image, then it would be interesting to see whether there has been a shift from labelling ‘gorilla’ to ‘chimpanzee’ since this story broke. That would actually show racists at work.
Don’t their nets get trained by user input? I wouldn’t be the least bit surprised if malicious racist trolls tagged black-skinned people as gorillas just out of spite.
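I have no idea whether Google Photos actually feeds user tags back into training, but if it does, the worry is easy to state concretely. A toy sketch (the aggregation rule and label names are entirely hypothetical) of how a naive “take the most common tag” step lets a handful of bad-faith tags poison the training label for an image:

```python
from collections import Counter

def aggregate_tag(user_tags):
    """Naive aggregation: the most common user tag becomes the training label."""
    return Counter(user_tags).most_common(1)[0][0]

honest = ["person", "person", "person"]
print(aggregate_tag(honest))        # -> 'person'

# Four bad-faith tags outvote three honest ones, and the poisoned label
# is what the next training run would ingest.
poisoned = honest + ["spiteful_label"] * 4
print(aggregate_tag(poisoned))      # -> 'spiteful_label'
```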
But isn’t that a distinction without a difference? Failing to test your software against a properly representative sample of faces is surely an example of unconscious bias.
Also, it should be pointed out that the problem is not symmetrical. If object identification doesn’t work and 1 in 1,000 times I am mistaken for a Snowman or a Naked Mole Rat, I am going to find that humorous, not offensive. One in one thousand is an acceptable error rate.
When my entire race has undergone multiple centuries of oppression, and that oppression involves the very terms that the image identifier is using, then one in a million is an utterly unacceptable error rate.
If the technology is not up to identifying people with 99.9999% accuracy (for any race), then the only sensible (and sensitive) thing to do is to remove objectionable terms altogether.
So if they remove the identifiers ‘gorilla’ etc., what will the system label pictures of actual primates as, then, once it has been ‘trained’ to ‘recognise’ dark-skinned people as people? I hesitate to suggest the obvious answer here, but if it were true then we can expect an even bigger almighty shitshow when that leaks out in due course.
Probably dumb question: Google Photos is different from Google image search? Is that an app of some kind? Because Google image search seems okay; I just did a search for “gorilla” and got lots of pictures of gorillas, with the only humans appearing in photos that also had gorillas in them.
Which raises the question, if Google image search can manage the task, why can’t this app do it? Or am I missing something?
Agreed. It’s a useful tool of classism and a deeply rooted systemic problem… as perpetually evidenced by people tripping all over themselves to deny its very existence.
Writing the to-be-trained algorithm is creating it.
Choosing training data is creating it.
As has been mentioned here on the Boing repeatedly, bias can get into these programs not just from conscious biases on the part of their programmers, but also from unconscious prejudices.
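A toy illustration of that point, with entirely made-up numbers and group names: nothing below encodes any prejudice, yet the choice of what pool to sample from decides how much of each group the model ever gets to see.

```python
import random

random.seed(0)

# Hypothetical photo pool: the 95/5 split and group names are invented
# purely to show how a sampling choice, not any "prejudiced" line of model
# code, determines what the classifier is trained on.
photo_pool = ["group_a_face"] * 9500 + ["group_b_face"] * 500

training_set = random.sample(photo_pool, 1000)  # "choosing training data"

for group in ("group_a_face", "group_b_face"):
    print(group, training_set.count(group))
# Typical output: roughly 950 vs 50 -- the under-represented group supplies
# about 5% of the training signal, so the model ends up worse at it without
# anyone having intended that outcome.
```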