A New Evaluation Measure for Imbalanced Datasets

Weng, C.G. and Poon, J.

    The area of imbalanced datasets is still relatively new, and it is known that overall accuracy is not an appropriate evaluation measure for imbalanced datasets, because of the dominating effect of the majority class. Although researchers have tried other existing measures, there is still no single evaluation measure that works well with imbalanced datasets. In this paper, we introduce a novel measure as a better alternative for evaluating imbalanced datasets. We provide a theoretical background for the new evaluation technique, which is designed to cope with cost biases; this changes the previous view that class-independent evaluation methods, such as ROC curves, cannot deal with costs. We also provide a general guideline for the ideal baseline performance when building classifiers with a known misclassification cost.
Cite as: Weng, C.G. and Poon, J. (2008). A New Evaluation Measure for Imbalanced Datasets. In Proc. Seventh Australasian Data Mining Conference (AusDM 2008), Glenelg, South Australia. CRPIT, 87. Roddick, J. F., Li, J., Christen, P. and Kennedy, P. J., Eds. ACS. 27-32.