REVIEW

Methods for the bioinformatic identification of bacterial lipoproteins encoded in the genomes of Gram-positive bacteria

Obaidur Rahman · Stephen P. Cummings · Dean J. Harrington · Iain C. Sutcliffe

Received: 30 April 2008 / Accepted: 15 June 2008 / Published online: 27 June 2008
© Springer Science+Business Media B.V. 2008

Abstract Bacterial lipoproteins are a diverse and functionally important group of proteins that are amenable to bioinformatic analyses because of their unique signal peptide features. Here we have used a dataset of sequences of experimentally verified lipoproteins of Gram-positive bacteria to refine our previously described lipoprotein recognition pattern (G+LPP). Sequenced bacterial genomes can be screened for putative lipoproteins using the G+LPP pattern. The sequences identified can then be validated using online tools for lipoprotein sequence identification. We have used our protein sequence datasets to evaluate six online tools for efficacy of lipoprotein sequence identification. Our analyses demonstrate that LipoP (http://www.cbs.dtu.dk/services/LipoP/) performs best individually but that a consensus approach, incorporating outputs from predictors of general signal peptide properties, is most informative.

Keywords Lipoproteins · Signal peptides · Bioinformatics · Genomics · Firmicutes · Actinobacteria

Introduction

Bacterial lipoproteins (Lpp) are a functionally diverse class of membrane anchored proteins that typically represent ca. 2% of the bacterial proteome (Sutcliffe and Harrington 2002; Sutcliffe and Harrington 2004; Babu et al. 2006; Sutcliffe and Hutchings 2007), although in some taxa the proportion is even higher (Bendtsen et al. 2005; Setubal et al. 2006). Lpp are of particular significance in Gram-positive bacteria as, in the absence of an outer membrane, various proteins must be tethered to the plasma membrane in order to be retained within the cell envelope.
Thus many Lpp of Gram-positive bacteria have functions directly comparable to those of periplasmic or surface proteins in Gram-negative bacteria. For example, the substrate-binding proteins that deliver substrates to the integral membrane components of ABC importer systems are typically Lpp in Gram-positive bacteria and periplasmic proteins in Gram-negative bacteria (Sutcliffe and Russell 1995). Consequently, many of the known or predicted functions of Gram-positive bacterial Lpp reflect their predicted localisation at the interface between the cell membrane and the extracytoplasmic compartment. Thus, in addition to the well-defined category of substrate-binding Lpp, a brief selection of Lpp functions includes roles as enzymes; in sensing environmental cues; in membrane-associated redox processes; and in correct protein export and localisation (Sutcliffe and Russell 1995; Sutcliffe and Harrington 2004; Sutcliffe and Hutchings 2007). This functional versatility means that it is extremely useful to be able to identify putative Lpp in order to gain further insights into the biology of biotechnologically and medically significant organisms. Moreover, the accurate prediction of protein localisation by sequence analysis is clearly an important aspect of genome annotation and, eventually, of understanding protein function (Gardy and Brinkman 2006).

Electronic supplementary material The online version of this article (doi:10.1007/s11274-008-9795-2) contains supplementary material, which is available to authorized users.

O. Rahman · S. P. Cummings · I. C. Sutcliffe
Northumbria University, Newcastle upon Tyne NE1 8ST, UK

D. J. Harrington
University of Bradford, West Yorkshire BD7 1DP, UK

I. C. Sutcliffe (corresponding author)
Biomolecular and Biomedical Research Centre, School of Applied Science, Northumbria University, Newcastle upon Tyne NE1 8ST, UK
e-mail: iain.sutcliffe@unn.ac.uk

World J Microbiol Biotechnol (2008) 24:2377–2382
DOI 10.1007/s11274-008-9795-2