Protein structure homology modeling using
SWISS-MODEL workspace
Lorenza Bordoli, Florian Kiefer, Konstantin Arnold, Pascal Benkert, James Battey & Torsten Schwede
Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Klingelbergstrasse 50/70, CH 4056 Basel, Switzerland. Correspondence should be addressed to T.S.
(torsten.schwede@unibas.ch).
Published online 11 December 2008; corrected online 18 June 2009 (details online); doi:10.1038/nprot.2008.197
Homology modeling aims to build three-dimensional protein structure models using experimentally determined structures of related
family members as templates. SWISS-MODEL workspace is an integrated Web-based modeling expert system. For a given target
protein, a library of experimental protein structures is searched to identify suitable templates. On the basis of a sequence alignment
between the target protein and the template structure, a three-dimensional model for the target protein is generated. Model quality
assessment tools are used to estimate the reliability of the resulting models. Homology modeling is currently the most accurate
computational method to generate reliable structural models and is routinely used in many biological applications. Typically, the
computational effort for a modeling project is less than 2 h. However, this does not include the time required for visualization and
interpretation of the model, which may vary depending on personal experience working with protein structures.
INTRODUCTION
The three-dimensional structure of a protein provides important
information for understanding its biochemical function and interaction
properties in molecular detail. However, the number of
known protein sequences is much larger than the number of
experimentally solved protein structures. As of August 2008,
more than 52,500 experimentally determined protein structures
were deposited in the Protein Data Bank (PDB) [1]. Yet, this number
appears relatively small compared with the more than 6 million
protein sequences held in the UniProt knowledge database [2].
Fortunately, the number of different protein fold families occurring in
nature appears to be limited [3], and within a protein family,
structural similarity between two homologous proteins can be
inferred from sequence similarity [4]. Homology modeling (or comparative
protein structure modeling) techniques have been developed
to build three-dimensional models of a protein (target) from
its amino-acid sequence on the basis of an alignment with a similar
protein with known structure (template) [5–7]. In cases where no
suitable template structure can be identified, de novo (also known as ab
initio) structure prediction methods can be used to generate three-dimensional
protein models without relying on a homologous
template structure. However, despite recent progress in the field,
de novo predictions are limited to relatively small proteins and fall
short in terms of accuracy compared with comparative models [8–12].
Therefore, homology modeling is the method of choice to build
reliable three-dimensional in silico models of a protein in all cases
where template structures can be identified.
Homology models are widely used in many applications, such as
virtual screening, designing site-directed mutagenesis experiments
or rationalizing the effects of sequence variations [13–17]. Automated
homology modeling therefore requires stable, reliable and accurate
systems that are easy to use for both nonspecialists
and experts in structural bioinformatics.
Homology modeling
Homology modeling in general consists of four main steps: (i)
identifying evolutionarily related proteins with experimentally
solved structures that can be used as template(s) for modeling
the target protein of interest; (ii) mapping corresponding residues
of target sequence and template structure(s) by means of sequence
alignment methods and manual adjustment; (iii) building the
three-dimensional model on the basis of the alignment; and
(iv) evaluating the quality of the resulting model [14,15]. This procedure
can be iterated until a satisfactory model is obtained (Fig. 1).
Protein structure homology modeling relies on the evolutionary
relationship between the target and template proteins. Potential
structural templates are identified using a search for homologous
proteins in a library of experimentally determined protein structures.
From the resulting list of possible candidate structures, a
template structure is chosen on the basis of its suitability according
to various criteria such as the level of similarity between the query
and template sequences, the experimental quality of the solved
structures, the presence of ligands or cofactors and so on. Ideally, a
large segment of the query sequence should be covered by a single
high-quality template, although in many cases, the available template
structures will correspond to only one or more distinct
structural domains of the protein.
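To make the selection criteria above concrete, the sketch below ranks candidate templates by sequence identity, target coverage, experimental resolution and ligand presence. It is an illustrative heuristic only: the `Candidate` fields, the ordering of the criteria in the sort key, and the identifiers `tplA` to `tplC` are assumptions made for this example, not the scoring used by SWISS-MODEL.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    template_id: str       # hypothetical identifier
    aligned_target: str    # target row of the pairwise alignment, '-' = gap
    aligned_template: str  # template row of the same alignment
    resolution: float      # experimental resolution in angstroms (lower is better)
    has_ligand: bool       # whether a ligand or cofactor is present


def identity_and_coverage(aligned_target: str, aligned_template: str) -> Tuple[float, float]:
    """Return (fraction of identical aligned pairs, fraction of target residues aligned)."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("alignment rows must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_target, aligned_template)
             if a != "-" and b != "-"]
    target_len = sum(1 for a in aligned_target if a != "-")
    if not pairs or target_len == 0:
        return 0.0, 0.0
    identity = sum(a == b for a, b in pairs) / len(pairs)
    coverage = len(pairs) / target_len
    return identity, coverage


def rank_templates(candidates: List[Candidate]) -> List[Candidate]:
    """Sort best-first: identity x coverage, then resolution, then ligand presence."""
    def key(c: Candidate):
        identity, coverage = identity_and_coverage(c.aligned_target, c.aligned_template)
        return (round(identity * coverage, 3), -c.resolution, c.has_ligand)
    return sorted(candidates, key=key, reverse=True)


candidates = [
    Candidate("tplA", "MKTAYIAK", "MKSAYLAK", 2.5, False),
    Candidate("tplB", "MKTAYIAK", "MKTA----", 1.5, False),
    Candidate("tplC", "MKTAYIAK", "MKSAYLAK", 1.8, True),
]
ranked_ids = [c.template_id for c in rank_templates(candidates)]
print(ranked_ids)
```

In this toy case, `tplB` is identical to the target over the aligned region but covers only half of it, so it ranks below the two full-length templates; between those, the better-resolved structure comes first.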
Figure 1 | The four main steps of comparative protein structure modeling:
template selection, target–template alignment, model building and model
quality evaluation.
[Flowchart: known structures (templates) and target sequence → template selection → template–target alignment → structure modeling → structure evaluation and assessment → homology model(s).]
NATURE PROTOCOLS | VOL.4 NO.1 | 2009 | 1
PROTOCOL