On the Stability of Machine Learning Models: Measuring Model and Outcome Variance
Volume 18, No. 2, 2020
Vasant Dhar and Haoyuan Yu
How do you know how much to trust a model learned from data? We propose that a central criterion for measuring trust is the decision-making variance of a model, which we call “model variance.” Conceptually, it refers to the inherent instability in a machine learning model’s decisions in response to variations in the training data. We report results from a controlled study that measures model variance as a function of (1) the inherent predictability of a problem and (2) the frequency of occurrence of the class of interest, or base rate. The results provide guidelines for what to expect from machine learning methods across problems that vary in predictability and base rate, and are therefore of general scientific interest.
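To make the notion concrete, the sketch below shows one way model variance could be estimated: train many models on bootstrap resamples of the training data and measure how often their decisions disagree on the same held-out cases. This is an illustrative assumption, not the paper’s protocol; the `class_sep` and `weights` arguments of scikit-learn’s `make_classification` are used only as stand-ins for the study’s manipulations of predictability and base rate.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative knobs (not the paper's settings): class_sep proxies the
# inherent predictability of the problem, weights sets the base rate.
X, y = make_classification(
    n_samples=5000, n_features=20, n_informative=10,
    class_sep=1.0,            # higher -> more predictable problem
    weights=[0.9, 0.1],       # 10% base rate for the class of interest
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Train many models on bootstrap resamples of the training data and
# record each model's decisions on the same held-out examples.
n_models = 50
rng = np.random.default_rng(0)
decisions = np.empty((n_models, len(X_test)), dtype=int)
for m in range(n_models):
    idx = rng.integers(0, len(X_train), size=len(X_train))  # bootstrap resample
    clf = RandomForestClassifier(n_estimators=100, random_state=m)
    clf.fit(X_train[idx], y_train[idx])
    decisions[m] = clf.predict(X_test)

# One simple instability measure: the rate at which individual models
# disagree with the majority decision on the same case (0 = fully stable).
majority = decisions.mean(axis=0).round()
disagreement = (decisions != majority).mean()
print(f"mean disagreement with the majority decision: {disagreement:.3f}")
```

Re-running the loop with a larger `class_sep` (a more predictable problem) or a less extreme `weights` setting (a higher base rate) and comparing the resulting disagreement rates mirrors, in spirit, the two factors the study varies.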