PHOSPHORYLATION SITE PREDICTION TOOL FOR CHANNEL PROTEINS USING STATISTICAL MACHINE LEARNING TECHNIQUES

Devasia Arun George 1 (deva0009@ntu.edu.sg)
Kwoh Chee Keong 1 (aschkwoh@ntu.edu.sg)
Nikhil Prasanna Jayan 1 (nikh0008@ntu.edu.sg)

1 School of Computer Engineering, Nanyang Technological University, Singapore.

Keywords: protein phosphorylation, channel proteins, statistical machine learning, HMM, ANN, SVM.

Abstract

1 Introduction

The protein phosphorylation network acts as a gateway of information to key cellular processes. Most phosphorylation processes are reversible and highly kinase-specific. In addition, phosphorylation reactions exhibit high substrate specificity. Hence, identifying the kinase-specific phosphorylation sites in proteins will enable a better understanding of molecular mechanisms. Phosphorylation of channel proteins modifies their functional properties and thereby regulates all downstream signaling processes. However, phosphorylation sites have not been well characterized in channel proteins. The authors believe that predicting phosphorylation sites in channel proteins will enable a better understanding of their diverse signal transduction pathways.

2 Method and Results

The authors are developing a tool that predicts phosphorylation sites in channel proteins. The prediction tool incorporates secondary features of the channel proteins, such as structural and conformational details, hydrophobicity, and evolutionary relationships. Datasets were created from protein kinase A (PKA) and protein kinase C (PKC) specific phosphorylation sites in channel proteins. Models were trained on these datasets using different statistical methods, namely hidden Markov models (HMM), artificial neural networks (ANN) and support vector machines (SVM). These trained models are combined into a single architecture that can predict phosphorylation sites efficiently and accurately.
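The combined architecture is not specified in detail here, so the following is only a minimal sketch of how such a pipeline might be organised, not the authors' implementation. It assumes a fixed-length sequence window around each serine or threonine (the residues targeted by PKA and PKC), uses the Kyte-Doolittle hydropathy scale as one example of a hydrophobicity feature, and combines per-model scores by majority vote. The function names, the flank length of 4, the 0.5 threshold and the voting rule are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Kyte-Doolittle hydropathy values, used here as one possible hydrophobicity feature.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# A "model" is any callable mapping a sequence window to a score in [0, 1].
# In practice these would be trained HMM, ANN and SVM predictors (assumption).
Model = Callable[[str], float]


def candidate_windows(seq: str, flank: int = 4,
                      targets: str = "ST") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every target residue.

    Windows have length 2 * flank + 1 and are padded with '-' at the termini.
    """
    padded = "-" * flank + seq + "-" * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue in targets:
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def mean_hydropathy(window: str) -> float:
    """Average Kyte-Doolittle hydropathy over the window, ignoring padding."""
    values = [KD[a] for a in window if a in KD]
    return sum(values) / len(values) if values else 0.0


def combine(scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Majority vote: positive if more than half of the models pass the threshold."""
    votes = sum(1 for s in scores if s >= threshold)
    return votes * 2 > len(scores)


def predict_sites(seq: str, models: Sequence[Model], flank: int = 4) -> List[int]:
    """Return 1-based positions that the combined models call phosphorylated."""
    return [
        pos
        for pos, window in candidate_windows(seq, flank)
        if combine([model(window) for model in models])
    ]
```

With trained predictors wrapped as callables, `predict_sites(sequence, [hmm_score, ann_score, svm_score])` would return the predicted positions; the three scorer names are placeholders. Majority voting is only one way to combine models, and weighted averaging or a second-level classifier would fit the same interface.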
3 Discussions

Although this prediction architecture covers only channel proteins and two kinases, the authors believe that its predictions will generalize well to other protein families and protein kinases because of their close familial relationships.