Predictive Model for the Phoneme Imbalanced Classification Dataset

Last Updated on August 21, 2020

Many binary classification tasks do not have an equal number of examples from each class, e.g. the class distribution is skewed or imbalanced.

Nevertheless, accuracy is equally important in both classes.

An example is the classification of vowel sounds from European languages as either nasal or oral on speech recognition where there are many more examples of nasal than oral vowels. Classification accuracy is important for both classes, although accuracy as a metric cannot be used directly. Additionally, data sampling techniques may be required to transform the training dataset to make it more balanced when fitting machine learning algorithms.

In this tutorial, you will discover how to develop and evaluate models for imbalanced binary classification of nasal and oral phonemes.

After completing this tutorial, you will know:

  • How to load and explore the dataset and generate ideas for data preparation and model selection.
  • How to evaluate a suite of machine learning models and improve their performance with data oversampling techniques.
  • How to fit a final model and use it to predict class labels for specific cases.

Kick-start your project with my new book Imbalanced Classification with Python, including step-by-step tutorials and
To finish reading, please visit source site