Cardio-Vascular Disease Prediction based on Ensemble technique enhanced using Extra Tree Classifier for Feature Selection
Baranidharan B¹, Abhisikta Pal², Preethi Muruganandam³

¹Baranidharan B, Associate Professor, Department of CSE, SRM IST, Chennai.
²Abhisikta Pal, UG Student, Department of CSE, SRM IST, Chennai.
³Preethi Muruganandam, UG Student, Department of CSE, SRM IST, Chennai.

Manuscript received on 3 August 2019. | Revised Manuscript received on 11 August 2019. | Manuscript published on 30 September 2019. | PP: 3236-3242 | Volume-8 Issue-3 September 2019 | Retrieval Number: C5404098319/2019©BEIESP | DOI: 10.35940/ijrte.C5404.098319
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Cardiovascular disease (CVD) has been a major cause of death over the last decade. Most patients diagnosed with CVD at a late stage die even when given advanced treatment. Earlier identification of heart disease may reduce the death rate, but the cost of medical diagnosis puts early detection out of reach for a large number of people. Machine learning and data mining techniques have been successful in medical diagnosis through non-invasive methods. In developing such models, feature selection is very important because it affects the accuracy of the diagnosis. In this research, feature selection is done with the Extra Tree classifier to identify the most important feature combinations for predicting heart disease. The Cleveland and Statlog datasets are used for developing and testing the model. The base classifiers Support Vector Machine (SVM), K Nearest Neighbour (KNN), Decision Tree, Logistic Regression, and Naïve Bayes, together with the Vote ensemble classifier, are tested on all 13 original features of the datasets, on a 9-feature combination, and on a 6-feature combination. It is observed that the Vote classifier using the 9-feature and 6-feature combinations gives the best accuracy and F1 score.
Keywords: Cardio-Vascular Disease, Feature Selection, Support Vector Machine, Ensemble Classifiers, Naïve Bayes

Scope of the Article:
Regression and Prediction
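
The pipeline summarised in the abstract (rank features with an Extra Trees model, keep the top 9 or 6, then compare a voting ensemble across feature sets) might be sketched roughly as below. This is only an illustration under several assumptions that the abstract does not confirm: scikit-learn is used as the toolkit, the data are a synthetic 13-feature stand-in generated with make_classification rather than the Cleveland or Statlog records, the Vote classifier is taken to be a hard majority vote over the five base classifiers, and all hyperparameters, the train/test split, and the helper names top_k_features and make_vote are hypothetical. The scores it prints say nothing about the paper's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic binary-label data with 13 features, standing in for
# a heart-disease table. It is NOT the Cleveland or Statlog dataset.
X, y = make_classification(n_samples=300, n_features=13, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)


def top_k_features(X, y, k):
    """Return the column indices of the k highest Extra Trees importances."""
    forest = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]  # best first
    return np.sort(order[:k])


def make_vote():
    """Hard-voting ensemble over the five base classifiers (assumed setup)."""
    return VotingClassifier(
        estimators=[
            ("svm", make_pipeline(StandardScaler(), SVC())),
            ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
            ("tree", DecisionTreeClassifier(random_state=0)),
            ("logreg", make_pipeline(StandardScaler(),
                                     LogisticRegression(max_iter=1000))),
            ("nb", GaussianNB()),
        ],
        voting="hard",
    )


# Evaluate the ensemble on all 13 features, then on the top 9 and top 6.
results = {}
for k in (13, 9, 6):
    cols = top_k_features(X_train, y_train, k)  # ranked on training data only
    model = make_vote().fit(X_train[:, cols], y_train)
    pred = model.predict(X_test[:, cols])
    results[k] = (accuracy_score(y_test, pred), f1_score(y_test, pred))
    print(f"{k} features {cols.tolist()}: "
          f"accuracy={results[k][0]:.3f}, F1={results[k][1]:.3f}")
```

Ranking features on the training split only, as done here, keeps the test split out of the selection step; whether the authors did the same is not stated in the abstract.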