An Effective Stratified K-Fold Algorithm with Logistic Regression for Drug Feedback Data
D. Naga Swathi1, Kumaran.U2
1D.Naga Swathi*, Department of Student Computer Science and Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India.
2Kumaran.U, Department of Computer Science and Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India.
Manuscript received on February 10, 2020. | Revised Manuscript received on February 20, 2020. | Manuscript published on March 30, 2020. | PP: 1964-1968 | Volume-8 Issue-6, March 2020. | Retrieval Number: F8166038620/2020©BEIESP | DOI: 10.35940/ijrte.F8166.038620
Open Access | Ethics and Policies | Cite | Mendeley
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Drug reviews are commonly used in pharmaceutical industry to improve the medications given to patients. Generally, drug review contains details of drug name, usage, ratings and comments by the patients. However, these reviews are not clean, and there is a need to improve the cleanness of the review so that they can be benefited for both pharmacists and patients. To do this, we propose a new approach that includes different steps. First, we add extra parameters in the review data by applying VADER sentimental analysis to clean the review data. Then, we apply different machine learning algorithms, namely linear SVC, logistic regression, SVM, random forest, and Naive Bayes on the drug review specify dataset names. However, we found that the accuracy of these algorithms for these datasets is limited. To improve this, we apply stratified K-fold algorithm in combination with Logistic regression. With this approach, the accuracy is increased to 96%.
Keywords: Vader (Valence Aware Dictionary for sentiment Reasoning), stratified K-fold, Machine Learning (ML), Drug Reviews, Natural Language Processing(NLP).
Scope of the Article: Machine Learning.