A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation

dc.contributor.authorInan, Onur
dc.contributor.authorUzer, Mustafa Serter
dc.date.accessioned2024-02-23T14:00:04Z
dc.date.available2024-02-23T14:00:04Z
dc.date.issued2021
dc.departmentNEÜen_US
dc.description.abstractNon-system errors that occur during data entry or data collection create noisy data that reduce the success of classification systems. To eliminate such data, a classification system was developed with a new data reduction method, MKMA-RAC, which consists of a modified k-means algorithm that uses Relief algorithm coefficients. The main theme of this article is the elimination of noisy data and its consistent application to the classification system using the k-fold cross-validation method. In the developed system, the support vector machine (SVM), linear discriminant analysis (LDA) and decision tree classifiers were integrated with MKMA-RAC-based data reduction in every fold, so that the training data were freed of noisy data. The data reduction process was not applied to the test data. The datasets used in the proposed method were the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets taken from the UCI database. Classification performance values obtained with tenfold CV, both with and without the proposed method, are given for these datasets. For the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets, the classification successes of the proposed system were 96.88%, 74.56%, 87.24% and 90.00% with the SVM classifier, 94.91%, 69.05%, 82.38% and 88.52% with the LDA classifier, and 96.25%, 77.73%, 88.77% and 89.63% with the decision tree classifier, respectively. The test results show that the proposed system generally achieved higher classification performance than results reported in the literature. The performance is therefore very encouraging for pattern recognition applications.en_US
dc.description.sponsorshipNecmettin Erbakan University; Selcuk University Scientific Research Projects Coordinatorshipen_US
dc.description.sponsorshipThe authors are grateful to Necmettin Erbakan University and Selcuk University Scientific Research Projects Coordinatorship for support of the manuscript.en_US
dc.identifier.doi10.1007/s13369-020-04972-y
dc.identifier.endpage1212en_US
dc.identifier.issn2193-567X
dc.identifier.issn2191-4281
dc.identifier.issue2en_US
dc.identifier.scopus2-s2.0-85091735926en_US
dc.identifier.scopusqualityQ1en_US
dc.identifier.startpage1199en_US
dc.identifier.urihttps://doi.org/10.1007/s13369-020-04972-y
dc.identifier.urihttps://hdl.handle.net/20.500.12452/11444
dc.identifier.volume46en_US
dc.identifier.wosWOS:000574137000002en_US
dc.identifier.wosqualityQ3en_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.language.isoenen_US
dc.publisherSpringer Heidelbergen_US
dc.relation.ispartofArabian Journal For Science And Engineeringen_US
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectClustering-Based Data Eliminationen_US
dc.subjectReliefen_US
dc.subjectMedical Dataset Classificationen_US
dc.titleA Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validationen_US
dc.typeArticleen_US
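
The evaluation protocol described in the abstract (data reduction applied to the training portion of every fold, never to the test portion) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify MKMA-RAC, so the filter here is a stand-in that runs plain k-means and drops training samples whose label disagrees with the majority label of their cluster, and a nearest-centroid rule stands in for the SVM, LDA and decision tree classifiers. All function names (`kmeans`, `eliminate_noisy`, `cross_validate`, and so on) are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (NOT the modified variant from the paper).

    Returns one cluster index per input point.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the previous center if a cluster is empty
                centers[c] = mean(members)
    return assign


def eliminate_noisy(X, y, k):
    """Stand-in filter: keep only samples whose label matches the
    majority label of the k-means cluster they fall into."""
    assign = kmeans(X, k)
    majority = {}
    for c in set(assign):
        labels = [lab for lab, a in zip(y, assign) if a == c]
        majority[c] = Counter(labels).most_common(1)[0][0]
    keep = [i for i, a in enumerate(assign) if y[i] == majority[a]]
    return [X[i] for i in keep], [y[i] for i in keep]


def fit_centroids(X, y):
    """Nearest-centroid 'classifier' (assumed stand-in): one mean per class."""
    return {lab: mean([x for x, l in zip(X, y) if l == lab]) for lab in set(y)}


def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda lab: sq_dist(x, model[lab]))


def cross_validate(X, y, n_folds=10, reduce_training=True, seed=0):
    """k-fold CV in which the elimination step sees training data only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        if reduce_training:
            # Reduction is applied per fold, to the training part only.
            X_tr, y_tr = eliminate_noisy(X_tr, y_tr, k=len(set(y_tr)))
        model = fit_centroids(X_tr, y_tr)
        # The test fold is scored untouched, noisy samples included.
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)
```

Calling `cross_validate(X, y, n_folds=10, reduce_training=False)` on the same data gives the baseline without elimination, which mirrors the with/without comparison the abstract reports. Filtering inside the loop, rather than once on the whole dataset, keeps information from the test fold out of the reduction step.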
