Protein structure homology modeling using SWISS-MODEL workspace

Lorenza Bordoli, Florian Kiefer, Konstantin Arnold, Pascal Benkert, James Battey & Torsten Schwede

Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Klingelbergstrasse 50/70, CH 4056 Basel, Switzerland. Correspondence should be addressed to T.S. (torsten.schwede@unibas.ch).

Published online 11 December 2008; corrected online 18 June 2009 (details online); doi:10.1038/nprot.2008.197

Homology modeling aims to build three-dimensional protein structure models using experimentally determined structures of related family members as templates. SWISS-MODEL workspace is an integrated Web-based modeling expert system. For a given target protein, a library of experimental protein structures is searched to identify suitable templates. On the basis of a sequence alignment between the target protein and the template structure, a three-dimensional model for the target protein is generated. Model quality assessment tools are used to estimate the reliability of the resulting models. Homology modeling is currently the most accurate computational method to generate reliable structural models and is routinely used in many biological applications. Typically, the computational effort for a modeling project is less than 2 h. However, this does not include the time required for visualization and interpretation of the model, which may vary depending on personal experience working with protein structures.

INTRODUCTION

The three-dimensional structure of a protein provides important information for understanding its biochemical function and interaction properties in molecular detail. However, the number of known protein sequences is much larger than the number of experimentally solved protein structures. As of August 2008, more than 52,500 experimentally determined protein structures were deposited in the Protein Data Bank (PDB) 1.
Yet, this number appears relatively small compared with the more than 6 million protein sequences held in the UniProt knowledge database 2. Fortunately, the number of different protein fold families occurring in nature appears to be limited 3, and within a protein family, structural similarity between two homologous proteins can be inferred from sequence similarity 4. Homology modeling (or comparative protein structure modeling) techniques have been developed to build three-dimensional models of a protein (target) from its amino-acid sequence on the basis of an alignment with a similar protein with known structure (template) 5–7. In cases where no suitable template structure can be identified, de novo (also known as ab initio) structure prediction methods can be used to generate three-dimensional protein models without relying on a homologous template structure. However, despite recent progress in the field, de novo predictions are limited to relatively small proteins and fall short in terms of accuracy compared with comparative models 8–12. Therefore, homology modeling is the method of choice to build reliable three-dimensional in silico models of a protein in all cases where template structures can be identified. Homology models are widely used in many applications, such as virtual screening, designing site-directed mutagenesis experiments or rationalizing the effects of sequence variations 13–17. Stable, reliable and accurate systems for automated homology modeling that are easy to use for both nonspecialists and experts in structural bioinformatics are therefore required.
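Because the argument above rests on sequence similarity between an aligned target and template, a small illustration may help. The following is a minimal sketch, not part of SWISS-MODEL: the function name `percent_identity`, the gap character and the choice to normalize by the number of ungapped aligned columns are all assumptions made for illustration (other normalization conventions exist).

```python
def percent_identity(aligned_target: str, aligned_template: str, gap: str = "-") -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumption: both strings come from the same pairwise alignment, so they
    have equal length and use `gap` to mark insertions and deletions.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    aligned_columns = 0
    identical_columns = 0
    for target_res, template_res in zip(aligned_target, aligned_template):
        # Skip columns where either sequence has a gap.
        if target_res == gap or template_res == gap:
            continue
        aligned_columns += 1
        if target_res.upper() == template_res.upper():
            identical_columns += 1

    if aligned_columns == 0:
        return 0.0
    return 100.0 * identical_columns / aligned_columns


# Made-up toy alignment (not real protein sequences).
identity = percent_identity("MKT-LV", "MKSALV")
print(f"{identity:.1f}% identity")
```

In this toy alignment, five columns are ungapped and four of them match, so the sketch reports 80% identity.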
Homology modeling

Homology modeling in general consists of four main steps: (i) identifying evolutionarily related proteins with experimentally solved structures that can be used as template(s) for modeling the target protein of interest; (ii) mapping corresponding residues of target sequence and template structure(s) by means of sequence alignment methods and manual adjustment; (iii) building the three-dimensional model on the basis of the alignment; and (iv) evaluating the quality of the resulting model 14,15. This procedure can be iterated until a satisfactory model is obtained (Fig. 1).

Protein structure homology modeling relies on the evolutionary relationship between the target and template proteins. Potential structural templates are identified using a search for homologous proteins in a library of experimentally determined protein structures. From the resulting list of possible candidate structures, a template structure is chosen on the basis of its suitability according to various criteria such as the level of similarity between the query and template sequences, the experimental quality of the solved structures, the presence of ligands or cofactors and so on. Ideally, a large segment of the query sequence should be covered by a single high-quality template, although in many cases, the available template structures will correspond to only one or more distinct structural domains of the protein.

[Figure 1 flowchart: known structures (templates) and target sequence; template selection; alignment template–target; structure modeling; structure evaluation and assessment; homology model(s).]

Figure 1 | The four main steps of comparative protein structure modeling: template selection, target–template alignment, model building and model quality evaluation.

© 2009 Nature Publishing Group http://www.nature.com/natureprotocols

NATURE PROTOCOLS | VOL.4 NO.1 | 2009 | 1
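The template selection criteria described in the Homology modeling section (sequence similarity, coverage of the query, experimental quality, presence of ligands or cofactors) can be illustrated with a simple ranking sketch. This is only one possible way to combine such criteria and is not the procedure used by SWISS-MODEL: the `Template` fields, the weights, the resolution scaling and the template identifiers below are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Template:
    template_id: str    # hypothetical identifier, not a real PDB entry
    identity: float     # percent sequence identity to the target (0-100)
    coverage: float     # fraction of the target sequence covered (0-1)
    resolution: float   # experimental resolution in angstroms (lower is better)
    has_ligand: bool    # whether a ligand or cofactor of interest is present


def score(t: Template,
          w_identity: float = 0.5,
          w_coverage: float = 0.3,
          w_resolution: float = 0.15,
          w_ligand: float = 0.05) -> float:
    """Weighted score in [0, 1]; the weights are illustrative assumptions."""
    # Map resolution linearly so that 1.0 A gives 1 and 4.0 A or worse gives 0.
    resolution_term = max(0.0, min(1.0, (4.0 - t.resolution) / 3.0))
    return (w_identity * t.identity / 100.0
            + w_coverage * t.coverage
            + w_resolution * resolution_term
            + w_ligand * float(t.has_ligand))


def rank_templates(templates: list[Template]) -> list[Template]:
    """Return the candidate templates sorted from best to worst score."""
    return sorted(templates, key=score, reverse=True)


# Made-up candidates for illustration only.
candidates = [
    Template("tmplB", identity=75.0, coverage=0.40, resolution=2.5, has_ligand=False),
    Template("tmplC", identity=35.0, coverage=0.98, resolution=3.2, has_ligand=False),
    Template("tmplA", identity=62.0, coverage=0.95, resolution=1.8, has_ligand=True),
]
ranked = rank_templates(candidates)
for t in ranked:
    print(f"{t.template_id}: {score(t):.3f}")
```

With these assumed weights, a template that covers most of the query at good resolution can outrank one with higher identity but covering only a single domain, which mirrors the preference stated above for a single high-quality template covering a large segment of the query. In practice such a score would only be a starting point for manual inspection.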