A Machine Learning Access for Selection of Influential Variables of Several ITK Inhibitors using Regression Research
Rama Devi Chalasani1, Radhika Y2
1Rama Devi Chalasani, Research Scholar, Department of CSE, GIT, Gitam Deemed to be University, Visakhapatnam (A.P), India.
2Dr. Radhika Y, Professor, Department of CSE, GIT, Gitam Deemed to be University, Visakhapatnam (A.P), India.
Manuscript received on 25 August 2019 | Revised Manuscript received on 11 September 2019 | Manuscript Published on 17 September 2019 | PP: 1867-1875 | Volume-8 Issue-2S8 August 2019 | Retrieval Number: B11710882S819/2019©BEIESP | DOI: 10.35940/ijrte.B1171.0882S819
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Introduction: Interleukin-2 inducible T-cell kinase (ITK) is a tyrosine kinase expressed in T-cells, NK cells and mast cells. Selective ITK inhibitors act as immunosuppressive and anti-inflammatory agents, reducing lung inflammation, eosinophil infiltration, and mucus production in response to induced allergic asthma. Methodology: A dataset of 142 ITK inhibitors, with inhibitory activity as the dependent variable and 32 compound properties as explanatory variables, was examined for multicollinearity prior to multivariate regression analysis. After data normalization, an inter-correlation cutoff of 0.75 retained 15 variables, and regression analysis on these gave r2 = 0.641 and adjusted r2 = 0.598 with an RMSE of 0.634. As the statistical parameters were within the limits, outlying data were investigated. Results: Standardized residual analysis identified nine outlying data points, and a new regression model built with n=133 and p=15 showed improved statistics. Further, stepwise and stepwise AIC regression followed by variance inflation factor analysis of the dataset revealed only 7 variables as important in defining the inhibitory activity against ITK. Permutations and combinations of these 7 variables gave r2 values >0.6 for the 5-, 6- and 7-variable models. Hence, to select the best model, the FIT criterion was employed, by which a 5-variable model was judged the best. Conclusion: An increase in HOMO, H-bond donors and shape index, with a concomitant decrease in the number of phenyl groups and the LUMO parameter, favors ITK inhibition.
Keywords: Regression, Multicollinearity, FIT Kubinyi Function, Outliers, ITK.
Scope of the Article: Machine Learning
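The abstract describes an inter-correlation cutoff, an adjusted r2 statistic, and model selection by the FIT criterion without giving formulas. The following is a minimal Python sketch of those three steps, not the authors' implementation. Several things in it are assumptions: the greedy order in which correlated columns are dropped, the function names (drop_correlated, adjusted_r2, kubinyi_fit), and the use of the commonly cited form of the Kubinyi function, FIT = r2(n - k - 1) / ((n + k^2)(1 - r2)), where n is the number of compounds and k the number of variables in the model.

```python
import numpy as np


def drop_correlated(X, cutoff=0.75):
    """Return indices of columns kept by a greedy inter-correlation filter.

    Assumption: a column is kept only if its absolute Pearson correlation
    with every previously kept column is at or below `cutoff`. The paper
    does not state which member of a correlated pair was discarded.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, i] <= cutoff for i in kept):
            kept.append(j)
    return kept


def adjusted_r2(r2, n, k):
    """Adjusted r2 for a linear model with n observations and k variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def kubinyi_fit(r2, n, k):
    """Kubinyi FIT criterion (commonly cited form; higher is better).

    The k**2 term penalizes models that add variables without a
    matching gain in r2.
    """
    return r2 * (n - k - 1) / ((n + k ** 2) * (1.0 - r2))


if __name__ == "__main__":
    # Synthetic descriptors, for illustration only (not the ITK dataset).
    rng = np.random.default_rng(0)
    a = rng.normal(size=200)
    b = rng.normal(size=200)
    X = np.column_stack([a, 2.0 * a, b])  # column 1 duplicates column 0
    print("kept columns:", drop_correlated(X, cutoff=0.75))

    # Statistics reported in the abstract: r2 = 0.641, n = 142, p = 15.
    print("adjusted r2:", round(adjusted_r2(0.641, 142, 15), 3))
    print("FIT:", kubinyi_fit(0.641, 142, 15))
```

With the reported r2 = 0.641, n = 142 and 15 variables, adjusted_r2 gives 0.598, which agrees with the value stated in the abstract. Because FIT divides by n + k^2, a 5-variable model can outscore 6- and 7-variable models whose r2 is only slightly higher, which is consistent with the selection reported above.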