I’ve posted this before, but it gets at the heart of the problem. It’s Zoubin Ghahramani talking about the “big lie” in machine learning, specifically the assumption that the training data spans the test data. In a sense, everything would be fine if we could guarantee the big lie was actually true, but then ML would be of limited use.
He talks about Bayesian methods as being useful in this area, and it’s not hard to see how. Essentially, the Bayesian approach can be used to encode prior expectations about how deviations of the test data from the training data might affect the result, and if the deviations are too big, then the system becomes very uncertain (as in, it knows it doesn’t know), which is an acceptable result. If these prior expectations are based on solid principles (e.g. objects have limits on how fast they can accelerate), then you can end up with a very well regularised system, and significant robustness to weird inferences. The problem then becomes one of translating common sense into prior statistical knowledge (and then of course doing the inference properly, which is probably non-trivial).
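To make the “knows it doesn’t know” point concrete, here is a minimal sketch (my own illustration, not from the talk) using Gaussian process regression: the kernel plays the role of the prior expectation (smoothness, in this case), and the predictive uncertainty grows as test inputs move away from the training data. The data, kernel choice, and parameters are all illustrative assumptions.

```python
# Minimal sketch: predictive uncertainty from a Gaussian process grows as
# test inputs move away from the training data, i.e. the model can signal
# "I don't know" outside the region its training data covers.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Training inputs confined to [0, 5]; test inputs extend well beyond that range.
X_train = rng.uniform(0, 5, size=(30, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(30)

# The RBF kernel encodes a prior expectation of smoothness -- a stand-in for
# "solid principles" like bounded rates of change.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(X_train, y_train)

X_test = np.linspace(0, 15, 200).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)

# Predictive std is small where the test inputs overlap the training range
# and large where they don't.
print("std near training data (x ~ 2.5):", std[np.argmin(np.abs(X_test - 2.5))])
print("std far from training data (x ~ 14):", std[np.argmin(np.abs(X_test - 14))])
```

Running this, the standard deviation near the training range stays small while the standard deviation far outside it is much larger, which is exactly the “it knows it doesn’t know” behaviour described above.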
This talk by Ali Rahimi is also well worth a watch - he discusses (rather controversially, as it turns out) how much of machine learning can be compared to alchemy, in that lots of people are doing things without quite understanding why, yet achieving useful results in the process. He argues that researchers need to get a handle on what actually makes these systems tick.