Assessing and Comparing Classifier Performance with ROC Curves

Last Updated on March 5, 2020

The most commonly reported measure of classifier performance is accuracy: the percent of correct classifications obtained.

This metric has the advantage of being easy to understand and makes comparison of the performance of different classifiers trivial, but it ignores many of the factors which should be taken into account when honestly assessing the performance of a classifier.

What Is Meant By Classifier Performance?

Classifier performance is more than just a count of correct classifications.

Consider, for instance, the problem of screening for a relatively rare condition such as cervical cancer, which has a prevalence of about 10%. If a lazy Pap smear screener were to classify every slide they see as “normal”, they would achieve 90% accuracy. Very impressive! But that figure completely ignores the fact that the 10% of women who do have the disease have not been diagnosed at all.
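The arithmetic behind the screening example can be checked with a small sketch. It assumes a hypothetical batch of 1,000 slides at the 10% prevalence quoted above; the variable names are illustrative only and do not come from any particular library.

```python
# Hypothetical batch: 1,000 slides, 10% of which are truly abnormal
# (the prevalence figure quoted in the text; the batch size is an assumption).
n_slides = 1000
n_diseased = 100

# Ground truth: 1 = disease present, 0 = normal.
y_true = [1] * n_diseased + [0] * (n_slides - n_diseased)

# The "lazy screener" labels every slide as normal.
y_pred = [0] * n_slides

# Accuracy: fraction of slides labelled correctly.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / n_slides

# Sensitivity: fraction of diseased slides that were actually flagged.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
sensitivity = true_positives / n_diseased

print(f"accuracy    = {accuracy:.0%}")     # 90%
print(f"sensitivity = {sensitivity:.0%}")  # 0%
```

The screener scores 90% on accuracy while detecting none of the diseased cases, which is exactly the gap that accuracy alone hides.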

Some Performance Metrics

In a previous blog post we discussed some of the other performance metrics which can be applied to the assessment of a classifier. To review:

Most classifiers produce a score, which is then thresholded to decide the classification. If a classifier produces a