Two years later, Google solves 'racist algorithm' problem by purging 'gorilla' label from image classifier


What is it that makes distinguishing black faces as human more difficult for the computer than doing so with white faces?


believe you are not racist. Most of us carry around the traces of our societies’ racist pasts. Most of us never examine those (still quite powerful) traces. So there probably isn’t such a thing as


Yes and no.

There are obviously no 100% ‘completely rational’ people, as humanity is an inherently irrational species.

There are, however, tons of people who are ‘not racist’, because racism is a system which requires power and influence, and many folks like myself have neither.

That said, there are no people who are ‘completely free’ from prejudice or bigotry of some sort. Whether we know it or not, we are all biased in some way, shape, or form; we all have “low contracted prejudices.”


Then good luck living in a 100% perfect utopia. Which sadly doesn’t exist.

It is like trying to legislate that pi equals 3. You can do it, but it still won’t change the fact that the real world doesn’t work like that.


If this shit was not “fucking hard”, do you think it would need enormous datacenters, petabytes of training data, and a bunch of PhDs, and still not work reliably?

But if you have a solution, I am sure Google will be very interested in hiring you.

FYI, image understanding and classification are among the hardest AI problems in existence.


Misdirection attempt failed; that’s not what I said nor even implied, and you can go ahead and stop addressing me now. Good day.


No, I didn’t say the job of automatic labeling wasn’t fucking hard. I said the solution to the problem of “misclassifying photos of blacks” wasn’t hard. Scrap the shitty work they’ve done and start over, and in the meantime HIRE PEOPLE to do the input MANUALLY.

Not hard. See? Where’s my check from Google?

I repeat: automating everything in the world is a fucking failure out of the box and they need to stop thinking they can invent elegant algorithms that will solve these problems, and then relying on shitty versions of their fondest wet dreams that utterly fail at the task they were created for. Stop deploying broken crap, hire people TO DO THE WORK until you get it right.


If I described a person as “black” I usually mean they are in a racial group we describe as “black”. If I describe a comedy as “black” I probably mean it contains humour about death or other grim subjects. If I describe a car as “black” I probably mean it’s the colour black. I’m going forward on the assumption this isn’t confusing for you.

You don’t have any trouble understanding that adjectives might have different meaning when applied to one thing than they do when they are applied to another. Stop going out of your way to have trouble understanding that calling a person racist means a different thing than calling an action racist which means a different thing than calling an institution racist which means a different thing than calling an algorithm racist.


Which is what I alluded to as well when speaking about the training samples. On the other hand, how many pictures of gorillas do you have available? If you want to make sure that the algorithm can reliably distinguish between a dark-skinned person and a gorilla (or some other potentially offensive label), you need to have them represented in comparable amounts in the training corpus.
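
To make the "comparable amounts" point concrete, here is a minimal sketch in Python. It counts the labels in a training set and derives inverse-frequency class weights, one common way to compensate when you cannot collect the classes in equal numbers (it is the same heuristic scikit-learn uses for `class_weight="balanced"`). The label names and counts are invented for illustration, and nothing here reflects Google's actual data or pipeline.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Scaled so that a perfectly balanced dataset gives 1.0 for every class.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, made-up label counts for illustration only.
labels = ["person"] * 900 + ["gorilla"] * 100
weights = inverse_frequency_weights(labels)
print(weights)
```

The rare class ends up with a much larger weight than the common one, so each of its examples counts for more during training. Reweighting only stretches the data you have; it does not replace collecting more varied images.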

See the problem? Yes, you are correct - theoretically. In practice these systems are limited by the availability of good quality, labeled training data. Since there isn’t going to be much data from Sub-Saharan Africa to train on, should we forbid this type of work lest someone gets offended because of an occasional unfortunate image labeling mistake?

This is not the same category of “bug” as e.g. the PredPol predictive policing system that sends cops into predominantly black and minority quarters much more often than to white ones or the sentencing systems used by courts to predict the chances of re-offending - again heavily stacked against minorities. There your argument would be a lot more meaningful because those systems don’t do anything else but whitewash the pre-existing and well documented institutional and personal biases encoded in the data they were trained on.

If you want to pick up the pitchforks and torches, this is where it would actually be helpful. Crucifying Google because their image classifier isn’t perfect and we are eager to stick the institutional racism label on it won’t help to improve it.


When someone starts out their declarative statement with a disclaimer, that’s a signifier…


You obviously have no idea whatsoever about the volumes of data used for these systems. Manual labeling can be done for a small-scale classifier with a few classes, but you will never be able to manually label the millions of images that Google’s system deals with.

Or pay for the army of people to do that labeling for you …



Perhaps I should have put quotes around “totally reasonable and not racist”, rather than italics.

In the current cultural climate, the only people who openly admit to racist structures and thought patterns in their world view are pretty much white supremacists and Nazis.

For everyone else, any label that smacks of racism transcends mere observation and becomes an ad hominem insult.

Which is to say, the specter of racism is so bad that most people refuse to recognize it in themselves, no matter how minute a part those influences play in their lives and thinking.

It’s a deep-seated problem, and the fact that most people (incorrectly) identify as totally reasonable and not racist does more harm to the conversation than good.


Good question!

It’s demographics. Far more white faces being uploaded means the system (which I understand to be trained by input, not explicitly programmed with morphological types) will be better at categorizing them.

One would expect the system to be best at recognizing white folks, much worse at recognizing black folks, and horrible at recognizing gorillas. Because demographics.

If you were to go to the nearest gorilla sanctuary and take lots of pictures of gorillas in every possible pose, and upload them all to Google, you would be helping. If you upload lots of pictures of black people, that helps too but quite a bit less. If you upload white people you’re not helping.


And so, on we go to the great big human mistake: one would expect a company not to initialize such a system within its offerings until the bugs that make it de facto racist have been eradicated.


“Pitchforks and torches”? “Crucifying”? Where are you getting that from? Not from my comment, unless I’ve had some kind of fugue and written (and subsequently deleted) some pretty inflammatory stuff.

You said that this problem couldn’t be racism because there was no intentional malice. I said that intentional malice was not a necessary precondition for racism. If you have an argument that refutes that, please make it.


Unfortunately it’s one of those things that is obvious in retrospect, not so much when you are inventing cutting edge technology and improving human connectivity and spreading educational opportunities.

And it’s also difficult to predict how racists and pseudo-racist griefers will attack new technologies, too, as witness Microsoft’s Nazi AI problems.

Speaking as a person with some small knowledge of technology, I’m not surprised or appalled that Google encountered this problem, and glad that users called it to their attention. I am disappointed by their late and lame solution, though. I think they have the resources to do better.


Same same.

But it may well also be a result of a general whitened perspective at Google. It’s like no one on staff had heard of the insights offered about de facto racist technology over 20 years ago, by Richard Dyer and many after him.


Well this is the issue. Usually you can break down a problem into more than one thing. Like that intersection where drivers couldn’t see cyclists that we recently highlighted on BB. The fact that the roads met at that angle is an accident. The fact that no one knew this was going to be a problem isn’t surprising. It may even be reasonable for the problem to go unnoticed for a long time (people who know people who die there don’t necessarily communicate with other people who know people who die there / even if the deaths are too frequent they may still be infrequent / large systems to track this sort of thing are a fairly recent idea).

But at some point you cross a threshold where you ought to know there is a problem. At some point you cross a threshold where you should have done something about the problem.

For those who like long stories, you can read one

Someone I know recently went to a presentation about policing in a small-ish city. The presentation had demographics in it. The demographics showed there were no indigenous people living in the city. They were asked about this (it didn’t seem plausible) and said that many indigenous people live in surrounding communities, but there were none in the city.

The city is about 5% indigenous people.

Someone looked up Statistics Canada stats on “visible minorities” to make a chart. It turns out that indigenous is not a “visible minority” in StatCan stats (there are other stats where it shows up, but it is explicitly not tracked in that category).

So I get how this happened, but I see a lot of failures. First, they should have someone who better understands the stats they use doing the charts. But this is administration staff for police, I don’t expect them to be statisticians, so I’m not too livid about this one.

But then no one said, hey wait, doesn’t our city have indigenous people living in it? They have a racist mindset that erases indigenous people.

Finally, when it was challenged, the person who received the challenge was more sure of their statistics than of their actual life experience where they have nearly certainly interacted with indigenous people living in the city. (Or more insistent on saving face than on being right) They were able to justify erasing indigenous people in their analysis.

I think Google made algorithms and reasonably was unable to predict how they would function. I think, though, that it’s clear that at some stage of testing their algorithms before putting them into use, there were people who (most likely because of unconscious biases) simply didn’t think anything like, “It’s important that we test this on a very diverse group of images of people.”

So I think they should have done better on the initial classifier, not just on their solution to it. It’s calling black people gorillas. Is it also misclassifying people with mobility devices? People with Down syndrome?
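
As a rough sketch of what "test this on a very diverse group of images" could look like in practice: compute the error rate separately for each subgroup of a labeled evaluation set instead of reporting one overall number. The group names, records, and function below are all hypothetical and invented for illustration; this is not a description of how Google tests anything.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical evaluation records, invented for illustration only.
records = [
    ("group_a", "person", "person"),
    ("group_a", "person", "person"),
    ("group_a", "person", "person"),
    ("group_a", "person", "dog"),
    ("group_b", "person", "person"),
    ("group_b", "person", "animal"),
]

rates = error_rate_by_group(records)
overall = sum(truth != pred for _, truth, pred in records) / len(records)
print(rates, overall)
```

The overall figure can look tolerable while one group's error rate is twice another's, which is exactly the kind of gap a single aggregate score hides.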

The whole thing makes me think a bit about:

They need to think of better lenses through which to view their products before they send them out the door.