Sequence Based DNA-Binding Protein Prediction
Farisa T S1, Elizabeth Isaac2

1Farisa T S*, Department of Computer Science and Engineering, Mar Athanasius College of Engineering, Kothamangalam, Kerala, India.
2Dr. Elizabeth Isaac, Department of Computer Science and Engineering, Mar Athanasius College of Engineering, Kothamangalam, Kerala, India.

Manuscript received on February 03, 2021. | Revised Manuscript received on February 15, 2021. | Manuscript published on March 30, 2021. | PP: 44-48 | Volume-9 Issue-6, March 2021. | Retrieval Number: 100.1/ijrte.B3665079220 | DOI: 10.35940/ijrte.B3665.039621
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Proteins and DNA play vital roles in biological processes. To predict DNA-binding proteins (DBPs) accurately, a new sequence-based prediction method is developed that takes only protein sequence information as input. First, a reliable benchmark data set is built from the Protein Data Bank. Second, four specific protein sequence base features are produced using Amino Acid Composition (AAC), Position Specific Scoring Matrix (PSSM), Predicted Solvent Accessibility (PSA), and Predicted Probabilities of DNA-Binding Sites (PDBS). The weights of these features are learned with a differential evolution algorithm, and the features are then merged with their weights to create a new super feature. TensorFlow is used to parallelize the weights. A suitable feature selection algorithm, based on TensorFlow's binary classifier, extracts the best subset from the weighted feature vector. After the final features are generated, the training sample set is obtained in the training process. The classifier is learned with a support vector machine and TensorFlow, and the output is measured using a tensor surface. The final decision is based on a probability threshold: a protein whose predicted probability is above the threshold is considered a DBP, and all others are considered non-DBPs.
Keywords: AAC, DBP, PSA, PSSM.
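
As an illustration of the pipeline outlined in the abstract, the following minimal Python sketch shows three of its simpler steps: computing AAC from a sequence, merging weighted feature blocks into one super feature, and applying a probability threshold. It is not the authors' implementation. The function names (aac, fuse, label), the example weights, and the default threshold of 0.5 are assumptions made for illustration only. The differential evolution step that learns the weights, and the PSSM, PSA, and PDBS features, which require external tools, are not shown.

```python
# Illustrative sketch only: names, weights, and threshold are assumptions,
# not taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(sequence):
    """Amino Acid Composition: fraction of each standard residue."""
    seq = sequence.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)  # non-standard symbols are ignored
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def fuse(feature_blocks, weights):
    """Scale each feature block by its weight, then concatenate the blocks."""
    if len(feature_blocks) != len(weights):
        raise ValueError("one weight per feature block is required")
    fused = []
    for block, w in zip(feature_blocks, weights):
        fused.extend(w * x for x in block)
    return fused


def label(probability, threshold=0.5):
    """Label a protein as DBP only if its probability is above the threshold."""
    return "DBP" if probability > threshold else "non-DBP"


if __name__ == "__main__":
    aac_block = aac("MKRLAAC")          # hypothetical sequence
    other_block = [0.2, 0.8]            # placeholder for another feature view
    super_feature = fuse([aac_block, other_block], [0.6, 0.4])
    print(len(super_feature), label(0.73))
```

In a fuller version, the weights passed to fuse would come from the differential evolution search, and the probability passed to label would come from the trained classifier.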