How to Calculate Nonparametric Statistical Hypothesis Tests in Python

Last Updated on August 8, 2019

In applied machine learning, we often need to determine whether two data samples have the same or different distributions.

We can answer this question using statistical significance tests, which quantify how likely the observed data would be if the samples had been drawn from the same distribution.

If the data does not have the familiar Gaussian distribution, we must resort to nonparametric versions of the significance tests. These tests operate in a similar manner but are distribution-free, and they require that real-valued data first be transformed into rank data before the test can be performed.
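To make the rank transform concrete, here is a minimal sketch using SciPy's rankdata() function. The sample values are made up for illustration; tied values receive the average of the ranks they span, which is the function's default behavior.

```python
from scipy.stats import rankdata

# Hypothetical real-valued observations (illustrative values only)
sample = [3.2, 1.5, 4.8, 1.5, 2.9]

# Replace each value with its rank in the sorted sample;
# the two tied values of 1.5 share the average rank (1 + 2) / 2 = 1.5
ranks = rankdata(sample)
print(ranks)
```

The nonparametric tests then work on these ranks rather than the raw values, which is why they do not depend on the shape of the underlying distribution.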

In this tutorial, you will discover nonparametric statistical tests that you can use to determine if data samples were drawn from populations with the same or different distributions.

After completing this tutorial, you will know:

  • The Mann-Whitney U test for comparing independent data samples: the nonparametric version of the Student t-test.
  • The Wilcoxon signed-rank test for comparing paired data samples: the nonparametric version of the paired Student t-test.
  • The Kruskal-Wallis H and Friedman tests for comparing more than two data samples: the nonparametric versions of the ANOVA and repeated measures ANOVA tests.
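
As a rough preview, the four tests listed above can be called through scipy.stats as sketched below. The three samples are synthetic (Gaussian noise with slightly different means, fixed seed) and are an assumption made only so the example runs; the same arrays are reused as if they were paired measurements for the Wilcoxon and Friedman tests, which would only be appropriate for genuinely paired data. The 0.05 significance level is likewise just a common convention.

```python
import numpy as np
from scipy.stats import mannwhitneyu, wilcoxon, kruskal, friedmanchisquare

# Synthetic samples for illustration only (assumed, not from a real dataset)
rng = np.random.default_rng(1)
data1 = 5 * rng.standard_normal(100) + 50
data2 = 5 * rng.standard_normal(100) + 51
data3 = 5 * rng.standard_normal(100) + 52

results = {
    # Two independent samples
    "Mann-Whitney U": mannwhitneyu(data1, data2, alternative="two-sided"),
    # Two paired samples (equal length, matched by position)
    "Wilcoxon signed-rank": wilcoxon(data1, data2),
    # More than two independent samples
    "Kruskal-Wallis H": kruskal(data1, data2, data3),
    # More than two paired samples
    "Friedman": friedmanchisquare(data1, data2, data3),
}

alpha = 0.05  # assumed significance level
for name, result in results.items():
    if result.pvalue > alpha:
        verdict = "same distribution (fail to reject H0)"
    else:
        verdict = "different distribution (reject H0)"
    print(f"{name}: statistic={result.statistic:.3f}, p={result.pvalue:.3f} -> {verdict}")
```

Each call returns a test statistic and a p-value; a p-value at or below the chosen alpha suggests the samples were drawn from different distributions.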
