ODBOT: Outlier detection-based oversampling technique for imbalanced datasets learning
Date
2021
Publisher
Springer London Ltd
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
In many real-world problems, datasets are imbalanced: samples of the majority classes far outnumber samples of the minority classes. In general, machine learning and data mining classification algorithms perform poorly on imbalanced datasets. In recent years, various oversampling techniques have been developed to address the class imbalance problem. Unfortunately, few of them can be extended to account for the relationships between classes or to exploit the correlation between attributes. Moreover, most existing oversampling techniques do not handle multi-class imbalanced datasets. To this end, this paper proposes a simple but effective outlier detection-based oversampling technique (ODBOT) to handle the multi-class imbalance problem. In ODBOT, outlier samples are detected by clustering within the minority class(es), and synthetic samples are then generated with these outlier samples taken into consideration. ODBOT generates efficient and consistent synthetic samples for the minority class(es) by carefully analyzing the dissimilarity relationships among the attribute values of all classes. Moreover, ODBOT can reduce the risk of overlap among different class regions and can yield a better classification model. The performance of ODBOT is evaluated in extensive experiments using 60 commonly used imbalanced datasets and five classification algorithms. The experimental results show that ODBOT consistently outperformed other common and state-of-the-art techniques in terms of various evaluation criteria.
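The general idea in the abstract (cluster the minority class, flag outliers, then generate synthetic samples with those outliers in mind) can be illustrated with a rough sketch. This is not the published ODBOT algorithm: the record gives no algorithmic details, so the k-means clustering, the "mean plus z standard deviations" outlier rule, the choice to exclude outliers from the interpolation seeds, and every function and parameter name below are assumptions made purely for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (illustrative stand-in); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment against the last centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids


def oversample_minority(X_min, n_new, k=2, z=2.0, seed=0):
    """Hypothetical outlier-aware oversampling of one minority class.

    Returns (synthetic_samples, outlier_mask).
    """
    rng = np.random.default_rng(seed)
    labels, centroids = kmeans(X_min, k, seed=seed)
    dist = np.linalg.norm(X_min - centroids[labels], axis=1)
    # Assumed outlier rule: unusually far from the sample's own centroid.
    outlier = dist > dist.mean() + z * dist.std()
    inliers, inlier_labels = X_min[~outlier], labels[~outlier]
    synthetic = []
    while len(synthetic) < n_new:
        i = rng.integers(len(inliers))
        # Interpolate only between non-outlier samples of the same cluster.
        same = np.flatnonzero(inlier_labels == inlier_labels[i])
        j = rng.choice(same)
        gap = rng.random()
        synthetic.append(inliers[i] + gap * (inliers[j] - inliers[i]))
    return np.array(synthetic), outlier
```

Keeping flagged outliers out of the interpolation is one plausible way to limit synthetic points that drift into other class regions; the paper itself may use the detected outliers differently.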
Keywords
Class Imbalance Dataset, Data Preprocessing, Outlier Detection, Oversampling
Source
Neural Computing & Applications
WoS Quartile
Q2
Scopus Quartile
Q1
Volume
33
Issue
22