Standard Machine Learning Datasets for Imbalanced Classification

Last Updated on January 14, 2020

An imbalanced classification problem is a problem that involves predicting a class label where the distribution of class labels in the training dataset is skewed.

Many real-world classification problems have an imbalanced class distribution, so it is important for machine learning practitioners to become familiar with these types of problems.
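To make the idea of a skewed class distribution concrete, here is a minimal sketch that summarizes how many examples belong to each class. The helper name `class_distribution` and the 95-to-5 label split are made up for illustration and are not taken from any of the datasets covered in this tutorial.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: (count, percentage)} for a sequence of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


# Hypothetical labels (made-up data): 95 examples of class 0, 5 of class 1
labels = [0] * 95 + [1] * 5

# Print the count and percentage of examples in each class
for label, (n, pct) in sorted(class_distribution(labels).items()):
    print(f"class={label}, count={n}, percentage={pct:.1f}%")
```

A summary like this is a reasonable first check on any new dataset: if one class accounts for most of the examples, the problem is likely an imbalanced classification problem.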

In this tutorial, you will discover a suite of standard machine learning datasets for imbalanced classification.

After completing this tutorial, you will know:

  • Standard machine learning datasets for binary classification with a skewed class distribution.
  • Standard datasets for multiclass classification with a skewed class distribution.
  • Popular imbalanced classification datasets used for machine learning competitions.

Let’s get started.

Tutorial Overview

This tutorial is divided into three parts; they are:

  1. Binary Classification Datasets
  2. Multiclass Classification Datasets
  3. Competition Datasets