Reading the article, it looks like Amazon have used entirely the wrong approach to designing this system. The description of the program “learning” patterns makes it sound like they have used a neural network and trained it on their existing hiring decisions.
That’s one of the infuriating problems with big data analysis: neural networks are known to have issues that make them unsuitable for several tasks, but these lessons constantly have to be relearned as more and more people jump on the Big Data bandwagon.
Specifically, the issues with neural networks in this instance are:
- Opacity: it’s inherently difficult to tell what features the network has actually learned to key on (a rough sketch of probing this follows the list).
- Known biases in the training sample: the historical decisions it learns from already encode the discrimination you’re trying to avoid.
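Opacity isn’t totally intractable, though. Here’s a minimal sketch, on toy synthetic data with hypothetical feature names, of using permutation importance to surface what a trained model is actually keying on. In this contrived example the model quietly learns a proxy for gender from biased labels, and the probe exposes it:

```python
# Minimal sketch: probing an opaque model's drivers with permutation
# importance. Data and column names are hypothetical, invented for
# illustration only.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic applicants: "attended_womens_college" acts as a proxy for
# gender, and the historical labels below were biased against it.
X = pd.DataFrame({
    "years_experience": rng.normal(8, 3, n),
    "num_patents": rng.poisson(1.0, n),
    "attended_womens_college": rng.integers(0, 2, n),
})
y = ((X["years_experience"] > 7)
     & (X["attended_womens_college"] == 0)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle one feature at a time and measure the drop in test score.
# A large drop for the proxy feature exposes the bias the model
# silently absorbed from the training labels.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for name, imp in sorted(zip(X.columns, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

This only tells you which inputs matter, not how they interact, so it’s a partial remedy, but it’s exactly the kind of check a regulated model would have to pass before deployment.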
In the fields that have been using data modelling the longest (pharma and credit analysis), there’s a wealth of data and experience in how to deal with these problems. For instance, it would be impossible to get approval for a neural network in those fields unless the model drivers could be shown not to be illegally discriminatory, and models have to be updated and recalibrated as part of a controlled process, not allowed to “evolve” as this one at Amazon seems to.
And on the second point: where the training sample is biased in a known way, there are established methods for that as well, such as reject inference, where a model fitted on the accepted cases is used to infer outcomes for the rejected ones, so the final model is trained on the whole applicant population rather than just the people who got through the old process.
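For the curious, reject inference in its simple “fuzzy augmentation” form looks roughly like this. A minimal sketch with toy data; in credit scoring the “rejects” are applicants you declined and therefore never observed an outcome for:

```python
# Minimal sketch of reject inference via fuzzy augmentation, a standard
# credit-scoring remedy for training only on accepted cases. The data
# here is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Accepted applicants: we observed their outcome (1 = good, 0 = bad).
X_accepted = rng.normal(1.0, 1.0, size=(500, 3))
y_accepted = (X_accepted.sum(axis=1) + rng.normal(0, 1, 500) > 2).astype(int)

# Rejected applicants: no outcome was ever observed for them, so a
# model fit only on X_accepted is trained on a biased sample.
X_rejected = rng.normal(-0.5, 1.0, size=(300, 3))

# Step 1: fit a base model on the accepted (biased) sample.
base = LogisticRegression().fit(X_accepted, y_accepted)

# Step 2: infer soft outcomes for the rejects from the base model.
p_good = base.predict_proba(X_rejected)[:, 1]

# Step 3: add each reject twice, once as "good" weighted by p_good and
# once as "bad" weighted by 1 - p_good, then refit on everything.
X_all = np.vstack([X_accepted, X_rejected, X_rejected])
y_all = np.concatenate([y_accepted,
                        np.ones(len(X_rejected)),
                        np.zeros(len(X_rejected))])
w_all = np.concatenate([np.ones(len(X_accepted)), p_good, 1 - p_good])

final = LogisticRegression().fit(X_all, y_all, sample_weight=w_all)
```

The point is that the final model sees the full applicant population, weighted by the best available guess at the missing outcomes, instead of only the survivors of the old, biased process.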
Seeing problems that already have well-understood solutions cropping up again like this is sheer stupidity.