Introduction to Dimensionality Reduction for Machine Learning

Last Updated on June 30, 2020

The number of input variables or features for a dataset is referred to as its dimensionality.

Dimensionality reduction refers to techniques that reduce the number of input variables in a dataset.

More input features often make a predictive modeling task more challenging, a problem generally referred to as the curse of dimensionality.

High-dimensional statistics and dimensionality reduction techniques are often used for data visualization. Nevertheless, these techniques can be used in applied machine learning to simplify a classification or regression dataset in order to better fit a predictive model.

In this post, you will discover a gentle introduction to dimensionality reduction for machine learning.

After reading this post, you will know:

  • Large numbers of input features can cause poor performance for machine learning algorithms.
  • Dimensionality reduction is a general field of study concerned with reducing the number of input features.
  • Dimensionality reduction methods include feature selection, linear algebra methods, projection methods, and autoencoders.
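As a minimal sketch of one of these approaches, the snippet below applies Principal Component Analysis (a projection method) with scikit-learn; the synthetic dataset and the choice of five components are illustrative assumptions, not recommendations from this post.

```python
# Sketch: reducing 20 input features to 5 with PCA (scikit-learn).
# The dataset and component count are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

# Synthetic classification dataset with 20 input features
X, y = make_classification(n_samples=100, n_features=20, random_state=1)
print(X.shape)  # (100, 20)

# Project the 20 input features down to 5 components
pca = PCA(n_components=5)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # (100, 5)
```

The reduced array `X_reduced` can then be used in place of the original features when fitting a predictive model.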

Kick-start your project with my new book Data Preparation for Machine Learning, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

  • Updated May/2020: Changed section headings to be more accurate.