Introduction

Understanding the details of integral membrane protein biogenesis is important for the study of any process or pathway that involves these proteins, including signaling cascades, vesicle trafficking and intercellular communication. Structural information is commonly used to predict protein function, and an important feature of the tertiary structure of an integral membrane protein is its topology, or its distribution relative to the membrane. Very few integral membrane proteins have had their topology determined experimentally, however, and of those proteins examined, several exhibit topological heterogeneity. That is, polypeptides with identical sequences can span the membrane differently. Researchers therefore commonly rely on topology prediction algorithms, which we will discuss after reviewing the details of biosynthesis. Although these algorithms are helpful for providing a first approximation, they are often imprecise and sometimes predict incorrect topology (see below). An appreciation of the complexity of integral membrane protein biosynthesis empowers scientists to think more critically about a variety of problems: when the data do not exactly fit the model, an alternate topological form may be part of the explanation.

Here we focus on the biosynthesis of mammalian integral membrane proteins that use one or more α-helical membrane-spanning domains to integrate into the lipid bilayer. Some integral membrane proteins have a single membrane-spanning domain (bitopic); others have several (polytopic). Bitopic membrane proteins are categorized according to the properties of their transmembrane (TM) domains (Fig. 1). During biogenesis, the N-terminus of a type I integral membrane protein is in the ER lumen, whereas in a type II integral membrane protein the N-terminus is in the cytoplasm.
Integral membrane proteins that use their first transmembrane domain as both a signal sequence and a stop transfer sequence are classified as signal-anchored proteins. C-terminally anchored proteins have a signal-anchored domain at the extreme C-terminus.

Overview of integral membrane protein biogenesis

Biosynthesis of integral membrane proteins involves several interrelated events: targeting of the nascent chain to the ER, translocation of all necessary domains into the ER lumen, recognition and proper orientation of TM domains, integration of TM domains into the lipid bilayer and, in some cases, formation of multimeric complexes. Nucleus-encoded proteins begin translation in the cytosol. Secretory and integral membrane proteins have a signal sequence that is recognized by the signal recognition particle (SRP) shortly after emerging from the ribosome (Walter and Johnson, 1994). Through interactions with its receptor on the surface of the ER, SRP transfers the ribosome-nascent-chain complex to the translocon, an aqueous pore in the ER membrane responsible for translocation and integration (Corsi and Schekman, 1996; Matlack et al., 1998; Fulga et al., 2001). On entering the translocon, integral membrane proteins differ from secretory proteins in that translocation stops and TM domains are oriented and integrated into the bilayer. In vivo, the orientation and integration of membrane proteins determine protein topology and are coupled to protein folding (Booth and Curran, 1999; Sanders and Nagy, 2000).

Synthesis of polytopic membrane proteins is more complex than that of bitopic membrane proteins. For example, instead of synthesizing the cytosolic domain of a type I membrane protein and then terminating translocation, the translocation machinery has to be switched on again to translocate another TM domain, another lumenal domain, and so on. How are these switches controlled?
They are regulated by several factors that can act independently or in concert. The hydrophobicity of the TM domain plays an important role.

Commentary

Integral membrane protein biosynthesis: why topology is hard to predict

Carolyn M. Ott 1 and Vishwanath R. Lingappa 2,3,*
1 Program in Biological Sciences, University of California, San Francisco, CA 94143-0444, USA
2 Departments of Physiology and 3 Medicine, University of California, San Francisco, CA 94143-0444, USA
*Author for correspondence (e-mail: vrl@itsa.ucsf.edu)

Journal of Cell Science 115, 2003-2009 (2002) © The Company of Biologists Ltd

Summary

Integral membrane protein biogenesis requires the coordination of several events: accurate targeting of the nascent chain to the membrane; recognition, orientation and integration of transmembrane (TM) domains; and proper formation of tertiary and quaternary structure. Initially unanticipated inter- and intra-protein interactions probably mediate each stage of biogenesis for single spanning, polytopic and C-terminally anchored membrane proteins. The importance of these regulated interactions is illustrated by analysis of topology prediction algorithm failures. Misassigned or misoriented TM domains occur because the primary sequence and overall hydrophobicity of a single TM domain are not the only determinants of membrane integration.

Key words: Translocon, Endoplasmic reticulum, Biogenesis, Signal transduction, Topogenesis
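To make concrete what a prediction based on hydrophobicity alone looks like, the following is a minimal sketch of a sliding-window hydropathy scan using the Kyte-Doolittle scale. It is an illustration only, not the method of any particular predictor discussed in this article. The function names, the 19-residue window and the 1.6 cutoff are assumptions chosen for the example. Because the scan scores each stretch of sequence in isolation, it shows the kind of information such an approach uses and, by omission, what it ignores (orientation signals and interactions with other TM domains or proteins).

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=19):
    """Return the mean hydropathy of every full window along seq.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Return merged (start, end) spans, 0-based and half-open, covered by
    windows whose mean hydropathy is at least threshold.

    Window size and threshold are illustrative assumptions.
    """
    segments = []
    for i, mean in enumerate(window_hydropathy(seq, window)):
        if mean < threshold:
            continue
        start, end = i, i + window
        if segments and start <= segments[-1][1]:
            # Overlaps or touches the previous span: extend it.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Hypothetical sequence: a leucine stretch flanked by charged residues.
    example = "D" * 10 + "L" * 20 + "K" * 10
    print(candidate_tm_segments(example))
```

A scan like this reports only where a hydrophobic stretch lies. It says nothing about which way the segment is oriented in the membrane or whether it is actually integrated, which is consistent with the point made in the Summary that hydrophobicity of a single TM domain is not the only determinant of membrane integration.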