EF_Unique: An Improved Version of Unsupervised Equal Frequency Discretization Method
Date
2018
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Springer Heidelberg
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Discretization is an important data preprocessing technique used in data mining and knowledge discovery processes. The purpose of discretization is to transform or partition continuous values into discrete ones. In this manner, many data mining classification algorithms can be applied to discrete data, which is more concise and meaningful than continuous data, resulting in better performance. In this study, an improved version of the unsupervised equal frequency (EF) discretization method, EF_Unique, is proposed for enhancing the performance of discretization. The proposed EF_Unique discretization method is based on the unique values of the attribute to be discretized. To test the success of the proposed method, 17 benchmark datasets from the UCI repository and four data mining classification algorithms were used, namely Naive Bayes, C4.5, k-nearest neighbor, and support vector machine. The experimental results of the proposed EF_Unique discretization method were compared with those obtained using well-known discretization methods: unsupervised equal width (EW), EF, and supervised entropy-based ID3 (EB-ID3). The results show that the proposed EF_Unique discretization method outperformed the EW, EF, and EB-ID3 discretization methods in 43, 41, and 27 of the 68 benchmark tests (17 datasets, four classifiers each), respectively.
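The abstract states only that EF_Unique is "based on the unique values of the attribute"; it does not give the algorithm. The sketch below is therefore an illustration under a stated assumption, not the published method: it contrasts ordinary equal frequency cut points (quantiles of all samples) with a hypothetical variant that takes the same quantiles over the distinct values only, so that a heavily repeated value does not dominate the bin boundaries. The function names `equal_frequency_cuts`, `unique_value_cuts`, and `discretize` are invented for this example.

```python
import numpy as np


def equal_frequency_cuts(values, n_bins):
    """Interior cut points that put roughly equal sample counts in each bin."""
    # Interior quantile levels, e.g. 1/3 and 2/3 for three bins.
    levels = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(np.asarray(values, dtype=float), levels)


def unique_value_cuts(values, n_bins):
    """ASSUMED variant: the same quantile rule applied to distinct values only.

    This is a guess at what "based on the unique values" could mean; the
    abstract does not specify the actual EF_Unique procedure.
    """
    distinct = np.unique(np.asarray(values, dtype=float))
    return equal_frequency_cuts(distinct, n_bins)


def discretize(values, cuts):
    """Map each value to a bin index; duplicate cut points are merged first."""
    return np.digitize(np.asarray(values, dtype=float), np.unique(cuts), right=True)


# An attribute in which the value 1 is heavily repeated.
attribute = [1, 1, 1, 1, 1, 1, 2, 3, 4, 5, 6, 7]

ef_cuts = equal_frequency_cuts(attribute, n_bins=3)
unique_cuts = unique_value_cuts(attribute, n_bins=3)

print("equal frequency cuts:", ef_cuts)
print("unique-value cuts:   ", unique_cuts)
print("equal frequency bins:", discretize(attribute, ef_cuts))
print("unique-value bins:   ", discretize(attribute, unique_cuts))
```

With this toy attribute, the repeated value pulls the first equal frequency cut point down to the repeated value itself, whereas the cut points computed over distinct values are spread across the attribute's range. Whether the published method behaves this way cannot be confirmed from the abstract alone.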
Description
Keywords
Classification Algorithms, Data Mining, Supervised Discretization, Unsupervised Discretization
Source
Arabian Journal For Science And Engineering
WoS Q Value
Q3
Scopus Q Value
Q1
Volume
43
Issue
12