A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation

Inan, Onur; Uzer, Mustafa Serter

dc.contributor.author	Inan, Onur
dc.contributor.author	Uzer, Mustafa Serter
dc.date.accessioned	2024-02-23T14:00:04Z
dc.date.available	2024-02-23T14:00:04Z
dc.date.issued	2021
dc.department	NEÜ	en_US
dc.description.abstract	Non-system errors that occur during data entry or data collection create noisy data that reduce the success of classification systems. To eliminate such data, a classification system was developed with a new data reduction method, named MKMA-RAC, that consists of a modified k-means algorithm using relief algorithm coefficients. The main theme of this article is the elimination of noisy data and its consistent application to the classification system using the k-fold cross-validation method. In the developed system, the support vector machine (SVM), linear discriminant analysis (LDA) and decision tree classifiers were integrated with MKMA-RAC-based data reduction in every fold, so that the training data were freed from noisy data. The data reduction process was not applied to the test data. The datasets used with the proposed method were the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets taken from the UCI database. Classification performance values obtained with and without the proposed method under tenfold CV are given for these datasets. For the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets, the classification successes of the proposed system were 96.88%, 74.56%, 87.24% and 90.00% with the SVM classifier, 94.91%, 69.05%, 82.38% and 88.52% with the LDA classifier, and 96.25%, 77.73%, 88.77% and 89.63% with the decision tree classifier, respectively. The test results show that the proposed system generally achieved higher classification performance than other results in the literature. The performance is therefore very encouraging for pattern recognition applications.	en_US
dc.description.sponsorship	Necmettin Erbakan University; Selcuk University Scientific Research Projects Coordinatorship	en_US
dc.description.sponsorship	The authors are grateful to Necmettin Erbakan University and Selcuk University Scientific Research Projects Coordinatorship for support of the manuscript.	en_US
dc.identifier.doi	10.1007/s13369-020-04972-y
dc.identifier.endpage	1212	en_US
dc.identifier.issn	2193-567X
dc.identifier.issn	2191-4281
dc.identifier.issue	2	en_US
dc.identifier.scopus	2-s2.0-85091735926	en_US
dc.identifier.scopusquality	Q1	en_US
dc.identifier.startpage	1199	en_US
dc.identifier.uri	https://doi.org/10.1007/s13369-020-04972-y
dc.identifier.uri	https://hdl.handle.net/20.500.12452/11444
dc.identifier.volume	46	en_US
dc.identifier.wos	WOS:000574137000002	en_US
dc.identifier.wosquality	Q3	en_US
dc.indekslendigikaynak	Web of Science	en_US
dc.indekslendigikaynak	Scopus	en_US
dc.language.iso	en	en_US
dc.publisher	Springer Heidelberg	en_US
dc.relation.ispartof	Arabian Journal For Science And Engineering	en_US
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Clustering-Based Data Elimination	en_US
dc.subject	Relief	en_US
dc.subject	Medical Dataset Classification	en_US
dc.title	A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation	en_US
dc.type	Article	en_US
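
The abstract describes applying noise elimination to the training data of every fold while leaving the test data untouched. The following is a minimal sketch of that fold-wise pattern only, not the authors' MKMA-RAC method. It makes several assumptions: plain k-means with deterministic farthest-first initialization stands in for the modified k-means; the per-feature weights `w` are a placeholder for relief coefficients and default to uniform; a nearest-centroid classifier stands in for SVM, LDA and decision trees; and a training point is dropped when its label disagrees with the majority label of its cluster. All function names are hypothetical.

```python
import random
from collections import Counter

def wdist(a, b, w):
    """Squared Euclidean distance with per-feature weights w (assumed stand-in for relief coefficients)."""
    return sum(wi * (x - y) ** 2 for x, y, wi in zip(a, b, w))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(X, k, w, iters=20):
    """Plain k-means with farthest-first initialization; returns a cluster index per point."""
    centers = [X[0]]
    while len(centers) < k:
        centers.append(max(X, key=lambda x: min(wdist(x, c, w) for c in centers)))
    assign = [0] * len(X)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: wdist(x, centers[c], w)) for x in X]
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = mean(members)
    return assign

def eliminate_noise(X, y, w, k):
    """Drop points whose label disagrees with the majority label of their cluster (assumed rule)."""
    assign = kmeans(X, k, w)
    majority = {
        c: Counter(lab for lab, a in zip(y, assign) if a == c).most_common(1)[0][0]
        for c in set(assign)
    }
    keep = [i for i, a in enumerate(assign) if y[i] == majority[a]]
    return [X[i] for i in keep], [y[i] for i in keep]

def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean vector per class label."""
    return {lab: mean([x for x, l in zip(X, y) if l == lab]) for lab in set(y)}

def predict(model, x, w):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda lab: wdist(x, model[lab], w))

def cross_validate(X, y, folds=10, w=None, seed=0):
    """k-fold CV in which noise elimination is applied to each training fold only."""
    w = w or [1.0] * len(X[0])
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        Xtr, ytr = eliminate_noise(
            [X[i] for i in train], [y[i] for i in train], w, k=len(set(y))
        )
        model = fit_centroids(Xtr, ytr)
        # The test fold is scored as-is: no points are removed from it.
        correct += sum(predict(model, X[i], w) == y[i] for i in test)
    return correct / len(X)
```

Calling `cross_validate(X, y, folds=10)` on a list of feature vectors `X` and labels `y` returns the fraction of held-out points classified correctly. Because elimination happens inside the loop, mislabeled points that fall in a test fold still count against the score, which is the consistency property the abstract emphasizes.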

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
