Develop a Model for the Imbalanced Classification of Good and Bad Credit

Last Updated on August 28, 2020

Misclassification errors on the minority class are more important than other types of prediction errors for some imbalanced classification tasks.

One example is deciding whether a bank customer should receive a loan. Giving a loan to a bad customer who was marked as good costs the bank more than denying a loan to a good customer who was marked as bad.

This requires careful selection of a performance metric that both promotes minimizing misclassification errors in general, and favors minimizing one type of misclassification error over another.

The German credit dataset is a standard imbalanced classification dataset that has this property of differing costs for misclassification errors. Models fit on this dataset can be evaluated using the Fbeta-Measure, which both quantifies model performance generally and captures the requirement that one type of misclassification error is more costly than another.
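To make the idea concrete, the sketch below computes the Fbeta-Measure from raw counts in plain Python. It is an illustration only: the labels are made up, the positive class is assumed to be the bad-credit minority, and beta=2 is an assumed setting that weights recall more heavily than precision. In practice, scikit-learn's `fbeta_score` function in `sklearn.metrics` computes the same quantity.

```python
# Minimal sketch of the Fbeta-Measure for a binary problem.
# Assumptions: label 1 = bad credit (minority, positive class), label 0 = good credit.

def fbeta(y_true, y_pred, beta=2.0, positive=1):
    """Return the F-beta score computed from true/false positive counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Toy labels, invented for illustration (not taken from the German credit data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Model A misses two bad customers (false negatives): precision 1.0, recall 0.5.
pred_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Model B flags four good customers as bad (false positives): precision 0.5, recall 1.0.
pred_b = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

for name, pred in (("A", pred_a), ("B", pred_b)):
    f1 = fbeta(y_true, pred, beta=1.0)
    f2 = fbeta(y_true, pred, beta=2.0)
    print("Model %s: F1=%.3f F2=%.3f" % (name, f1, f2))
```

With beta=1 the two toy models tie, because precision and recall are simply swapped between them. With beta=2 the score favors Model B, which catches every bad customer at the price of rejecting some good ones. That is the behavior wanted when missing a bad customer is the more costly error.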

In this tutorial, you will discover how to develop and evaluate a model for the imbalanced German credit classification dataset.

After completing this tutorial, you will know: