Tuned Random Forest Algorithm for Improved Prediction of Cardiovascular Disease
P.Nancy1, B.Swaminathan2, K.Navina3, B.Nandhine4, P.Lokesh5
1Dr.P.Nancy, Assistant Professor, Department of Computer Science and Engineering, Rajalakshmi Engineering College, Chennai, India.
2Dr.B.Swaminathan, Professor, Department of Computer Science and Engineering, Rajalakshmi Engineering College, Chennai, India.
3K.Navina, UG Scholar, Department of Computer Science and Engineering, Rajalakshmi Engineering College, Chennai, India.
4B.Nandhine, UG Scholar, Department of Computer Science and Engineering, Rajalakshmi Engineering College, Chennai, India.
5P.Lokesh, UG Scholar, Department of Computer Science and Engineering, Rajalakshmi Engineering College, Chennai, India.
Manuscript received on April 02, 2020. | Revised Manuscript received on April 21, 2020. | Manuscript published on May 30, 2020. | PP: 1355-1360 | Volume-9 Issue-1, May 2020. | Retrieval Number: A1599059120/2020©BEIESP | DOI: 10.35940/ijrte.A1599.059120
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Data mining is becoming increasingly popular and essential in the field of medicine. The large amounts of data produced every day by the medical industry are too complex and voluminous to be processed and analyzed by traditional means. In such cases, data mining comes into play. Although several prediction algorithms exist, their efficiency is questionable because of their high error rates. It is therefore necessary to choose a prediction algorithm that gives higher accuracy with fewer errors. The aim of this paper is to create a system for efficient and accurate prediction of cardiovascular disease. The datasets are taken from the UCI Machine Learning Repository and are tested for accuracy using the ANOVA technique. The algorithms are investigated using the WEKA tool. The best features for prediction are obtained from feature selection algorithms. Various classification algorithms are applied to the datasets to identify the most efficient algorithm. We observe that random forest gives consistently better accuracy than the other algorithms. The random forest algorithm is then tuned to further improve the accuracy of the prediction system.
Keywords: Random Forest, Data Mining, Classification Algorithms, WEKA, Tuning, Accuracy.
Scope of the Article: Classification
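Illustrative sketch: The abstract mentions the ANOVA technique and feature selection but does not say how either was applied. One common reading is a one-way ANOVA F-test computed per feature across the class labels, with higher F values suggesting features whose means differ more between classes. The Python sketch below illustrates that idea only; it is not the authors' WEKA workflow, and the function names (anova_f, rank_features), the feature names (feat_a, feat_b), and the toy data are all hypothetical.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (lists of floats)."""
    k = len(groups)                      # number of groups (classes)
    n = sum(len(g) for g in groups)      # total number of observations
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        # No within-group spread: the ratio is undefined, so report the limit.
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def rank_features(rows, labels, names):
    """Score each feature column by its F statistic, highest first."""
    scores = []
    for j, name in enumerate(names):
        by_class = {}
        for row, y in zip(rows, labels):
            by_class.setdefault(y, []).append(row[j])
        scores.append((name, anova_f(list(by_class.values()))))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two features, two classes (0 = healthy, 1 = disease).
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0],
        [7.0, 4.0], [8.0, 5.0], [9.0, 3.0]]
labels = [0, 0, 0, 1, 1, 1]
ranked = rank_features(rows, labels, ["feat_a", "feat_b"])
# feat_a separates the classes (F = 54.0); feat_b does not (F = 0.0).
```

In a setting like the one the abstract describes, the top-ranked features would then be passed to the candidate classifiers. The number of features to keep is a choice the abstract does not specify.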