Int. J. Bioinformatics Research and Applications, Vol. 1, No. 3, 2006 319
Copyright © 2006 Inderscience Enterprises Ltd.
Improved protein fold assignment using support
vector machines
Robert E. Langlois, Alice Diec,
Ognjen Perisic, Yang Dai and Hui Lu*
Department of Bioengineering,
University of Illinois at Chicago,
60607 Illinois, USA
Fax: 312 413 2018 E-mail: rlangl1@uic.edu
E-mail: adiec1@uic.edu E-mail: operis1@uic.edu
E-mail: yangdai@uic.edu E-mail: huilu@uic.edu
*Corresponding author
Abstract: Because of the relatively large gap between the number of known
protein sequences and the number of known protein structures, the ability to
construct a computational model predicting structure from sequence information
has become an important area of research. Knowledge of a protein’s structure is
crucial to understanding its biological role. In this work, we present a support
vector machine based method for recognising a protein’s fold from sequence
information alone, in cases where the sequence has little similarity to sequences
of known structure. We have focused on improving multi-class classification,
parameter tuning, descriptor design, and feature selection. The current
implementation demonstrates better prediction accuracy than previous similar
approaches, and has similar performance when compared with straightforward
threading.
Keywords: fold recognition; support vector machines; machine learning;
proteomics; structure prediction.
Reference to this paper should be made as follows: Langlois, R.E., Diec, A.,
Perisic, O., Dai, Y. and Lu, H. (2006) ‘Improved protein fold assignment using
support vector machines’, Int. J. Bioinformatics Research and Applications,
Vol. 1, No. 3, pp.319–334.
Biographical notes: Robert Ezra Langlois is a second-year PhD student in
Bioinformatics in the Department of Bioengineering at the University of Illinois at
Chicago. He earned a BS degree in Bioengineering at UIC in May 2003. Currently
he is supported by an NIH training grant: Cellular Signaling in Cardiovascular
System. His research interests include machine learning, protein folding,
structure prediction, protein function prediction, and binding prediction of
signaling proteins.
Alice Diec earned her Master’s degree in Bioinformatics from the Department
of Bioengineering at UIC in October 2004. Currently, she is working at the
Washington University Genome Center.
Ognjen Perisic is a third-year PhD student in Bioinformatics in the Department of
Bioengineering at UIC. His research interests are in computational biophysics,
free energy calculation, non-equilibrium statistical physics in biology, and
protein structure prediction.