Evolving specificity from variability for protein interaction domains

Tomonori Kaneko¹, Sachdev S. Sidhu² and Shawn S.C. Li¹

¹ Department of Biochemistry and the Siebens-Drake Research Institute, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, N6A 5C1, Canada
² Banting and Best Department of Medical Research and Department of Molecular Genetics, University of Toronto, The Donnelly Centre, 160 College Street, Toronto, Ontario, M5S 3E1, Canada

An important question in modular domain–peptide interactions, which play crucial roles in many biological processes, is how the diverse specificities exhibited by different members of a domain family are encoded in a common scaffold. Analysis of the Src homology (SH) 2 family has revealed that its specificity is determined, in large part, by the configuration of surface loops that regulate ligand access to binding pockets. In a distinct manner, SH3 domains employ loops for ligand recognition. The PDZ domain, in contrast, achieves specificity by co-evolution of binding-site residues. Thus, the conformational and sequence variability afforded by surface loops and binding sites provides a general mechanism by which to encode the wide spectrum of specificities observed for modular protein interaction domains.

A dilemma with modular interaction domains

Information processing in a cell depends on the assembly and dissolution of macromolecular complexes in a spatiotemporally controlled manner. This dynamic process of signal transduction frequently involves protein–protein interactions that are mediated by modular protein domains [1–4]. Over 100 families of modular domains have been identified to date [5,6], which makes them a significant feature of the human genome. Some domains recognize oligonucleotides, lipids, or other small molecules, whereas others are devoted to mediating protein–protein interactions [7].
This latter group of protein interaction domains recognizes unmodified or post-translationally modified peptide sequences in their binding partners, thereby driving the formation of specific protein complexes and larger protein interaction networks. Aberrant protein–protein interactions underpin numerous diseases, notably cancer [8–11].

Comparative genomic analysis has revealed that the number of protein interaction domains has undergone a remarkable expansion during the evolution from unicellular organisms to multicellular metazoans [12–14]. For instance, the Src homology (SH)3 domain, which recognizes proline-rich motifs, has grown from 27 members in yeast to approximately 300 in humans [15,16]. The number of PDZ (postsynaptic density protein 95, PSD-95; discs large, Dlg; and zonula occludens-1, ZO-1) domains has increased from three in yeast to over 250 in humans [5,17]. Similarly, the SH2 domain, the largest family of modular domains dedicated to the recognition of phosphotyrosine, has 120 members in the human genome but only one in yeast [18].

The expansion of protein interaction domains raises the question of how the diverse specificity observed for a given domain family, which defines its biological function, is engendered by a common structural scaffold. Drawing on recent work on the SH2, SH3 and PDZ domains, we propose here a unifying paradigm for the evolution of domain specificity in which variability of the surface loops and/or binding-site residues drives the rapid expansion of domain specificity.

Surface loop configuration dictates the major specificity classes of the SH2 domain

By relaying signals from tyrosine kinases to downstream proteins, SH2 domains play a pivotal role in cellular signal transduction [18–21]. All SH2 domains comprise approximately 100 amino acids and share a common fold characterized by a central seven-stranded β-sheet flanked by two α-helices (Figure 1a) [22–24].
SH2 domains, in general, bind only to peptide sequences that contain a phosphorylated tyrosine (pTyr) [25,26]. Each member, however, has a distinct preference for residues C-terminal to the pTyr. Earlier studies of SH2–peptide complexes led to the proposal of a two-prongs-engaging-two-binding-sites model to explain SH2 domain specificity [27,28]. In this model, the first prong of the peptide, the pTyr, inserts into a positively charged pocket (characterized by an invariant Arg residue at βB5, Figure 1a), whereas the second prong, the C-terminal residue of the peptide, is accommodated by a second pocket on the surface of the SH2 domain. Although this elegant model captures the essence of many SH2 domain–ligand interactions, it does not provide a full explanation for the diverse specificity displayed by the SH2 domain family [1,26,29–34].

Opinion
Corresponding author: Li, S.S.C. (sli@uwo.ca)
0968-0004/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibs.2010.12.001
Trends in Biochemical Sciences, April 2011, Vol. 36, No. 4

The specificities of approximately two-thirds of the human SH2 domains have been mapped by screening synthetic peptide libraries [25,26,33] and peptide library arrays [35]. These studies suggest that the SH2 domains can be divided into three major specificity classes according to the position at which their selectivity is strongest. The P+2 class (where P+2 denotes the second residue C-terminal to pTyr), represented by the growth factor receptor-bound protein 2 (GRB2) SH2 domain, recognizes an Asn+2 residue; the P+3 class, typified by the Src SH2 domain, selects a residue (frequently