How to Use Quantile Transforms for Machine Learning

Last Updated on August 28, 2020

Numerical input variables may have a highly skewed or non-standard distribution.

This could be caused by outliers in the data, multi-modal distributions, highly exponential distributions, and more.

Many machine learning algorithms prefer, or perform better, when numerical input variables (and, in the case of regression, output variables) have a standard probability distribution, such as a Gaussian (normal) or a uniform distribution.

The quantile transform provides an automatic way to transform a numeric input variable to have a different data distribution, which, in turn, can be used as input to a predictive model.
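To make the idea concrete, here is a minimal sketch of a rank-based quantile transform written with only the Python standard library. It is an illustration of the general technique, not the implementation used by any particular library: the function name quantile_transform, the (rank + 0.5) / n plotting position, and the exponential sample data are all assumptions chosen for this example.

```python
import random
from statistics import NormalDist


def quantile_transform(values, output="normal"):
    """Map values to a uniform or Gaussian distribution using their ranks.

    Hypothetical helper for illustration; ties are broken by order of
    appearance, which is a simplification.
    """
    n = len(values)
    # Indices of the observations, sorted from smallest to largest value
    order = sorted(range(n), key=lambda i: values[i])
    uniform = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps every estimate strictly inside (0, 1)
        uniform[i] = (rank + 0.5) / n
    if output == "uniform":
        return uniform
    # Pass the uniform values through the inverse CDF of a standard normal
    std_normal = NormalDist()
    return [std_normal.inv_cdf(u) for u in uniform]


# Assumed sample data: a highly skewed (exponential) variable
random.seed(1)
skewed = [random.expovariate(1.0) for _ in range(1000)]

uniform_like = quantile_transform(skewed, output="uniform")
gaussian_like = quantile_transform(skewed, output="normal")
```

The transform only depends on the rank of each observation, so the ordering of the data is preserved while the shape of the distribution changes.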

In this tutorial, you will discover how to use quantile transforms to change the distribution of numeric variables for machine learning.

After completing this tutorial, you will know:

  • Many machine learning algorithms prefer or perform better when numerical variables have a Gaussian or standard probability distribution.
  • Quantile transforms are a technique for transforming numerical input or output variables to have a Gaussian or uniform probability distribution.
  • How to use the QuantileTransformer to change the probability distribution of numeric variables to improve the performance of predictive models.
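As a quick preview of the last point, the sketch below applies scikit-learn's QuantileTransformer to a synthetic skewed variable. The exponential sample data, the choice of n_quantiles=100, and the variable names are assumptions made for illustration, not settings recommended by the tutorial.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Assumed sample data: one skewed (exponential) input column with 1,000 rows
rng = np.random.default_rng(1)
X = rng.exponential(scale=1.0, size=(1000, 1))

# Map the variable to a Gaussian-like distribution;
# use output_distribution="uniform" for a uniform distribution instead
qt = QuantileTransformer(n_quantiles=100, output_distribution="normal", random_state=1)
X_gauss = qt.fit_transform(X)

# Summarize the variable before and after the transform
print("before: mean=%.3f, std=%.3f" % (X.mean(), X.std()))
print("after:  mean=%.3f, std=%.3f" % (X_gauss.mean(), X_gauss.std()))
```

In a modeling workflow, the transform would typically be fit on the training data only and then applied to the test data, for example by placing it in a scikit-learn Pipeline.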
