How to Calculate Bootstrap Confidence Intervals For Machine Learning Results in Python
Last Updated on August 14, 2020
It is important to present both the expected skill of a machine learning model and confidence intervals for that model skill.
A confidence interval provides a range of model skill and a likelihood that the model skill will fall within that range when making predictions on new data. For example, a 95% likelihood that classification accuracy is between 70% and 75%.
A robust way to calculate confidence intervals for machine learning algorithms is to use the bootstrap. This is a general resampling technique for estimating statistics that can be used to calculate empirical confidence intervals, regardless of the distribution of skill scores (e.g. non-Gaussian).
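To make the idea concrete, here is a minimal sketch of a percentile bootstrap interval using only the Python standard library. It is an illustration under stated assumptions, not a definitive implementation: the function name `bootstrap_ci`, its parameters, and the accuracy scores are all hypothetical, and the statistic being estimated is assumed to be the mean.

```python
import random

def bootstrap_ci(scores, n_iterations=1000, alpha=0.95, seed=1):
    """Percentile bootstrap confidence interval for the mean of `scores`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(scores)
    stats = []
    for _ in range(n_iterations):
        # Resample with replacement, same size as the original sample.
        sample = [scores[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    # Cut off an equal share of the sorted statistics in each tail.
    tail = round((1.0 - alpha) / 2.0 * n_iterations)
    lower_idx = tail
    upper_idx = n_iterations - tail - 1
    return stats[lower_idx], stats[upper_idx]

# Made-up accuracy scores, e.g. from repeated evaluations of one model.
scores = [0.71, 0.74, 0.69, 0.73, 0.72, 0.75, 0.70, 0.72, 0.74, 0.71]
lower, upper = bootstrap_ci(scores)
print(f"95% interval for mean accuracy: {lower:.3f} to {upper:.3f}")
```

Because the interval is read directly from the sorted bootstrap statistics, the sketch makes no assumption about the shape of the score distribution.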
In this post, you will discover how to use the bootstrap to calculate confidence intervals for the performance of your machine learning algorithms.
After reading this post, you will know:
- How to estimate confidence intervals of a statistic using the bootstrap.
- How to apply this method to evaluate machine learning algorithms.
- How to implement the bootstrap method for estimating confidence intervals in Python.
Kick-start your project with my new book Statistics for Machine Learning, including step-by-step tutorials and the Python source code files for all examples.