How to Use Statistical Significance Tests to Interpret Machine Learning Results

Last Updated on August 8, 2019

It is good practice to gather a population of results when comparing two different machine learning algorithms or when comparing the same algorithm with different configurations.

Repeating each experimental run 30 or more times gives you a population of results from which you can calculate the mean expected performance, given the stochastic nature of most machine learning algorithms.
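As a minimal sketch of what gathering such a population might look like, the snippet below repeats a run 30 times and summarizes the scores. The function `run_experiment` is a hypothetical stand-in that draws a simulated score; in a real experiment it would train and evaluate a model with the given random seed. The score values (mean 0.80, spread 0.02) are assumptions chosen purely for illustration.

```python
import random
import statistics


def run_experiment(seed):
    """Hypothetical stand-in for one train/evaluate run; returns a skill score."""
    rng = random.Random(seed)
    # Simulated accuracy: a real experiment would fit and score a model here.
    return rng.gauss(0.80, 0.02)


# Repeat the experiment 30 times, each with a different random seed.
scores = [run_experiment(seed) for seed in range(30)]

mean_score = statistics.mean(scores)
std_score = statistics.stdev(scores)  # sample standard deviation
print(f"runs={len(scores)} mean={mean_score:.3f} std={std_score:.3f}")
```

Reporting the standard deviation alongside the mean gives a first impression of how much the results vary from run to run.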

If the mean expected performance of two algorithms or configurations differs, how do you know whether the difference is significant, and how significant?

Statistical significance tests are an important tool for interpreting the results of machine learning experiments. Their findings can also help you present your experimental results with more confidence and choose the right algorithms and configurations for your predictive modeling problem.

In this tutorial, you will discover how you can investigate and interpret machine learning experimental results using statistical significance tests in Python.

After completing this tutorial, you will know:

  • How to apply normality tests to check whether your data is (or is not) normally distributed.
  • How to apply parametric statistical significance tests for normally distributed results.
  • How to apply nonparametric statistical significance tests for
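The workflow outlined in the list above could be sketched roughly as follows, using SciPy. The two result sets are simulated, and the specific tests chosen here (Shapiro-Wilk for normality, Student's t-test as the parametric test, Mann-Whitney U as the nonparametric test) are assumptions for illustration, as is the 0.05 significance level; other tests may be equally appropriate.

```python
import numpy as np
from scipy.stats import mannwhitneyu, shapiro, ttest_ind

# Simulated scores from 30 runs of two hypothetical configurations.
rng = np.random.default_rng(1)
results_a = rng.normal(loc=0.80, scale=0.02, size=30)
results_b = rng.normal(loc=0.82, scale=0.02, size=30)

alpha = 0.05  # assumed significance level

# Step 1: normality check. A p-value above alpha means the test found
# no evidence against the sample being drawn from a Gaussian.
p_norm_a = shapiro(results_a).pvalue
p_norm_b = shapiro(results_b).pvalue
looks_gaussian = p_norm_a > alpha and p_norm_b > alpha

# Step 2: choose a parametric or nonparametric test accordingly.
if looks_gaussian:
    test_name = "Student's t-test"
    p_value = ttest_ind(results_a, results_b).pvalue
else:
    test_name = "Mann-Whitney U test"
    p_value = mannwhitneyu(results_a, results_b, alternative="two-sided").pvalue

# Step 3: interpret the result against the significance level.
verdict = "significant" if p_value <= alpha else "not significant"
print(f"{test_name}: p={p_value:.4f} ({verdict} at alpha={alpha})")
```

A p-value at or below the chosen level suggests the difference in mean performance is unlikely to be due to chance alone; a larger p-value means the samples give no grounds to claim a difference.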