Towards a method for fixing machine learning's persistent and catastrophic blind spots


It reminds me of how small children often learn that a boy is a person with short hair, and a girl is a person with long hair… Over many years their training set grows large enough to make a better classifier…


Or that cats are small animals with pointy ears and dogs larger animals with droopy ears – many children upon seeing a Chihuahua for the first time think it is a cat – and so do many image recognition algorithms.


Will this make computers less racist, though?


Yeah, my son clashed repeatedly with the class bully over that for a couple of months, because he was convinced that the lad in question was a girl based upon his undercut hairstyle and high-pitched voice.

I have been trying to educate him on using people’s preferred pronouns, and he doesn’t really bat an eye at trans people, but he had to misgender the kid that swears, fights, and bites, didn’t he…


Most “AI” is just decision trees coded by people who basically pick domain knowledge off the top of their heads, so of course it’s rubbish. As a bike rider I am deeply suspicious of the algorithms behind self-driving cars.


Wait till you see what they’ll come up with for self-driving (self-riding?) bikes. :wink:


More and more I’m convinced that machine learning and the underlying algorithms are useless wastes of our time and money and should not in any form be used in production systems. In no near future will machines be able to make the kind of logical leaps humans can, instantly modifying an existing set of ideas to incorporate new data. The machines simply mislabel and misclassify and go about their business, and only when an egregious example occurs do the researchers revisit and try to fix it (by recoding and building new data sets, which introduces all-new problems). And tech companies are using these to try to automate their systems, much to our detriment, so they can increase their profitability by not hiring folks to do the hard work these systems are clearly so poor at doing.

I love technology and futurism, but this path isn’t panning out and it’s becoming an excuse to allow shitty things to happen. “Darn, sorry those minorities got singled out, arrested, and sent to jail. Not our fault, it was the algorithm. Oh well, we’ll rewrite it.”


The word “ROBUST” really caught my eye - I remember learning regression equations for statistics and the professor gave us two methods that we could use. He taught us one of them, then said the next one was more ROBUST. When I pressed him for what ROBUST meant, he said it just was that - it was more ROBUST.

It was frustrating because it was a descriptor that didn’t match the object - calling an equation ROBUST is like calling it GREEN. I had/have no idea what a ROBUST equation is - would be interested if anyone out there can do a better job…


Which - I think it deserves mention - would be the least problematic aspect of an AI project with this particular objective…


ROBUST typically means that the chosen method is less susceptible to producing incorrect results when the data fall outside the typical range.

Here’s a snippet that might make sense (specifically on ROBUST regression):

When fitting a least squares regression, we might find some outliers or high leverage data points. We have decided that these data points are not data entry errors, nor are they from a different population than most of our data, so we have no compelling reason to exclude them from the analysis. Robust regression might be a good strategy, since it is a compromise between excluding these points entirely from the analysis and including all the data points and treating them all equally in OLS regression. The idea of robust regression is to weigh the observations differently based on how well behaved these observations are. Roughly speaking, it is a form of weighted and reweighted least squares regression.
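To make the “weighted and reweighted least squares” idea concrete, here is a minimal sketch of iteratively reweighted least squares with Huber weights. The data, the weight function, and the tuning constant are my own illustrative choices, not anything from the quoted text:

```python
import numpy as np

def huber_weights(residuals, c=1.345):
    """Downweight points whose residual is large in robustly-scaled units."""
    scale = np.median(np.abs(residuals)) / 0.6745 + 1e-12  # MAD-based sigma
    r = residuals / scale
    w = np.ones_like(r)
    big = np.abs(r) > c
    w[big] = c / np.abs(r[big])    # large residuals get small weights
    return w

def robust_fit(x, y, iters=20):
    X = np.column_stack([np.ones_like(x), x])   # intercept + slope
    w = np.ones_like(y)
    for _ in range(iters):
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
        w = huber_weights(y - X @ beta)          # re-weigh, then refit
    return beta                                  # [intercept, slope]

# Toy data: y = 2x + small noise, plus one gross outlier at the far end.
rng = np.random.default_rng(0)
x = np.arange(20.0)
y = 2 * x + rng.normal(0, 0.1, size=20)
y[-1] = 100.0

ols, *_ = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)
rob = robust_fit(x, y)
print("OLS slope:", round(ols[1], 2))     # dragged off by the outlier
print("robust slope:", round(rob[1], 2))  # stays near the true slope of 2
```

The outlier keeps its full weight of 1 in plain OLS, but after a few reweighting rounds its Huber weight is nearly zero, so the fitted slope barely notices it.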

I’ll also include a snippet from the MIT whitepaper itself, where they define the term as they intend to use it:

We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data.

Basically, in this context, “robustness” is a generic term for “will the computer provide unsurprising-to-humans results, given a variety of inputs.”


I don’t know how near that future will be, but I do know that humans are usually pretty bad at modifying existing sets of ideas to incorporate new data: that’s cognitive dissonance, and as an example I give you the convinced Trump voter.

And yet, on the whole, humans are more useful than not, and the same is becoming ever more true of machine learning.

There are a lot of ways to get more robust results. Sometimes statisticians use medians instead of means to reduce the impact of outliers. Another approach, called shotgunning, runs the analysis on a variety of subsets of the sample data and converges on a characterization that doesn’t depend on any particular selection. That latter approach might improve machine learning, but not as much as actually reintroducing some semantic analysis, as Rodney Brooks often points out.
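Both ideas above fit in a few lines. This sketch shows the median shrugging off an outlier that drags the mean, and a simple “shotgun” style estimate built from many random subsets; the sample sizes and subset counts are illustrative choices of mine, not from the post:

```python
import numpy as np

rng = np.random.default_rng(1)
# 99 well-behaved points near 10, plus one wild outlier.
data = np.concatenate([rng.normal(10, 1, 99), [1000.0]])

print("mean:  ", round(data.mean(), 1))      # dragged toward the outlier
print("median:", round(np.median(data), 1))  # barely moves

# "Shotgunning": run the analysis (here just a mean) on many random
# subsets, then summarize the per-subset estimates with a median.
subset_means = [rng.choice(data, size=30, replace=False).mean()
                for _ in range(500)]
print("shotgun estimate:", round(np.median(subset_means), 1))
```

Most subsets never draw the outlier, so the median of the per-subset estimates lands near 10 even though the raw mean of the full sample is pulled far away.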


The key part that I think is missing, which ML implicitly hopes will arise from the dust, is an internal “mental” model of the world, its objects and their (relevant) behaviours. ML as practised is nearly always reductive, just fitting input to output, with little scope for generating implied models of behaviour in between. It also all too often discards (as this paper implies, then tries to recover) a lot of stats along the way, usually with the strawman argument that the practitioner couldn’t fit a linear model (shame on those using that argument).

When you see a picture of a dog you don’t just label it “dog”. You have in your mind a whole range of doggy behaviours, including sitting for a photo, maybe a dog you knew that looked like the picture, which help you make that identification. You are less likely to be confused by a cat picture in no small part, I think, because of the additional information you have about how cats sit for pictures (generally poorly).

The current surge in ML is the “infinite number of monkeys” approach. Loading a bunch of libraries into Python and throwing data into them while burning copious CPU/GPU time is relatively easy. Real models are hard. Capacity is, in the end, well short of the required “infinite”.

ML has the “quantity has a quality all its own” going for it; that’s about it.


Oh, and this is my current favourite “logical leap” challenge for ML people:

Teach your AI about real number arithmetic, exponentiation, etc. Now: discover complex numbers.

This usually leads to flustered expressions and “oh, well, that’s real intelligence”.

But without complex numbers, we miss out on the very world of technology which pretends to be AI.


This topic was automatically closed after 5 days. New replies are no longer allowed.