EF_Unique: An Improved Version of Unsupervised Equal Frequency Discretization Method

Date

2018

Publisher

Springer Heidelberg

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Discretization is an important data preprocessing technique used in data mining and knowledge discovery processes. Its purpose is to transform or partition continuous values into discrete ones. Many data mining classification algorithms can then be applied to the discrete data, which are more concise and meaningful than continuous data, resulting in better performance. In this study, an improved version of the unsupervised equal frequency (EF) discretization method, EF_Unique, is proposed to enhance discretization performance. The proposed EF_Unique method is based on the unique values of the attribute to be discretized. To evaluate the proposed method, 17 benchmark datasets from the UCI repository and four data mining classification algorithms were used: Naive Bayes, C4.5, k-nearest neighbor, and support vector machine. The experimental results of EF_Unique were compared with those obtained using three well-known discretization methods: unsupervised equal width (EW), unsupervised EF, and supervised entropy-based ID3 (EB-ID3). The results show that EF_Unique outperformed the EW, EF, and EB-ID3 discretization methods in 43, 41, and 27 of the 68 benchmark tests, respectively.
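The abstract does not spell out the EF_Unique algorithm, so the following is only a minimal sketch of one plausible reading: standard equal frequency discretization places cut points over all sorted instances, while the assumed variant places them over the distinct values of the attribute. The function names (`equal_frequency_cuts`, `ef_unique_cuts`, `discretize`), the midpoint rule for cut points, and the sample data are illustrative assumptions, not taken from the paper.

```python
from bisect import bisect_right


def equal_frequency_cuts(values, k):
    """Cut points that place roughly len(values)/k sorted items in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    if k < 1 or k > n:
        raise ValueError("k must be between 1 and the number of values")
    cuts = []
    for i in range(1, k):
        idx = i * n // k
        # Midpoint between the two neighbours at the boundary position.
        cut = (ordered[idx - 1] + ordered[idx]) / 2
        if not cuts or cut > cuts[-1]:  # skip repeated cut points
            cuts.append(cut)
    return cuts


def ef_unique_cuts(values, k):
    """Assumed variant: equal frequency applied to the distinct values only."""
    distinct = set(values)
    return equal_frequency_cuts(distinct, min(k, len(distinct)))


def discretize(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical attribute with one heavily repeated value.
data = [1, 1, 1, 1, 1, 1, 2, 3, 4, 5, 6, 7]
ef_cuts = equal_frequency_cuts(data, 3)
unique_cuts = ef_unique_cuts(data, 3)
ef_bins = discretize(data, ef_cuts)
unique_bins = discretize(data, unique_cuts)
```

On this sample, the plain equal frequency cut points land on the repeated value, which leaves the first interval empty. Working from the distinct values spreads the cut points across the attribute's actual range. Whether this matches the published method in detail should be checked against the paper itself.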

Keywords

Classification Algorithms, Data Mining, Supervised Discretization, Unsupervised Discretization

Source

Arabian Journal For Science And Engineering

WoS Q Value

Q3

Scopus Q Value

Q1

Volume

43

Issue

12

Citation