Comparing the Performance of Winsorize Tree to Other Data Mining Techniques for Cases Involving Outliers
Chee Keong Ch’ng
Chee Keong Ch’ng, School of Quantitative Sciences, College of Arts and Sciences, Universiti Utara Malaysia, Sintok, Kedah, Malaysia.
Manuscript received on 26 June 2019 | Revised Manuscript received on 14 July 2019 | Manuscript Published on 26 July 2019 | PP: 197-201 | Volume-8 Issue-2S2 July 2019 | Retrieval Number: B10360782S219/2019©BEIESP | DOI: 10.35940/ijrte.B1036.0782S219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Winsorize tree is a modified tree derived from the classification and regression tree (CART). Its strategy is to handle and accommodate outliers simultaneously in all nodes while generating the subsequent branches of the tree. The presence of outliers normally degrades the accuracy of most classifiers. We therefore propose the winsorize tree, which is resistant to anomalous data and preserves the original data while performing the splitting process. In this study, the winsorize tree was compared with other classifiers. Results from five real datasets indicate that, in terms of misclassification rate, the proposed winsorize tree performs as well as or better than the other data mining techniques.
Keywords: Winsorize Tree Algorithm; Outlier; Gini Index; Misclassification Rate; Classification; Classification and Regression Tree; Winsorized Tree.
Scope of the Article: Data Mining
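
To make the idea in the abstract concrete, the following is a minimal sketch of how outliers might be winsorized inside a node before a Gini-based split is chosen. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the winsorizing rule, so the inner-fence rule (Q1 - 1.5 IQR, Q3 + 1.5 IQR), the quartile interpolation, and the function names `gini`, `quartiles`, `winsorize`, and `best_split` are all hypothetical. Working on a copy of the node's values is one possible reading of "preserves the original data".

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def quartiles(sorted_vals):
    """Return (Q1, Q3) using linear interpolation between order statistics."""
    def quantile(p):
        pos = p * (len(sorted_vals) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(sorted_vals) - 1)
        return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])
    return quantile(0.25), quantile(0.75)


def winsorize(values, k=1.5):
    """Return a clamped copy of values; the input list is not modified.

    ASSUMPTION: values outside [Q1 - k*IQR, Q3 + k*IQR] are pulled back
    to the nearest fence. The paper may use a different rule.
    """
    q1, q3 = quartiles(sorted(values))
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]


def best_split(values, labels):
    """Pick the threshold with the lowest weighted Gini impurity,
    evaluated on the winsorized copy of this node's feature values."""
    w = winsorize(values)
    n = len(w)
    best_threshold, best_score = None, float("inf")
    cuts = sorted(set(w))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for a, b in zip(cuts, cuts[1:]):
        t = (a + b) / 2
        left = [y for x, y in zip(w, labels) if x <= t]
        right = [y for x, y in zip(w, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score


# One node with a single extreme value in the feature.
node_values = [1, 2, 3, 4, 5, 100]
node_labels = [0, 0, 0, 1, 1, 1]
print(winsorize(node_values))
print(best_split(node_values, node_labels))
```

In a full tree, the same step would be repeated in each child node on that node's own subset of observations, so the fences adapt to the local distribution rather than being fixed once for the whole dataset.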