Evolving specificity from variability for protein interaction domains

Tomonori Kaneko¹, Sachdev S. Sidhu² and Shawn S.C. Li¹

¹ Department of Biochemistry and the Siebens-Drake Research Institute, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, N6A 5C1, Canada
² Banting and Best Department of Medical Research and Department of Molecular Genetics, University of Toronto, The Donnelly Centre, 160 College Street, Toronto, Ontario, M5S 3E1, Canada

An important question in modular domain–peptide interactions, which play crucial roles in many biological processes, is how the diverse specificities exhibited by different members of a domain family are encoded in a common scaffold. Analysis of the Src homology (SH) 2 family has revealed that its specificity is determined, in large part, by the configuration of surface loops that regulate ligand access to binding pockets. In a distinct manner, SH3 domains employ loops for ligand recognition. The PDZ domain, in contrast, achieves specificity by co-evolution of binding-site residues. Thus, the conformational and sequence variability afforded by surface loops and binding sites provides a general mechanism by which to encode the wide spectrum of specificities observed for modular protein interaction domains.

A dilemma with modular interaction domains

Information processing in a cell depends on the assembly and dissolution of macromolecular complexes in a spatiotemporally controlled manner. This dynamic process of signal transduction frequently involves protein–protein interactions that are mediated by modular protein domains [1–4]. Over 100 families of modular domains have been identified to date [5,6], which makes them a significant feature of the human genome. Some domains recognize oligonucleotides, lipids, or other small molecules, whereas others are devoted to mediating protein–protein interactions [7].
This latter group of protein interaction domains recognizes unmodified or post-translationally modified peptide sequences in their binding partners, thereby driving the formation of specific protein complexes and larger protein interaction networks. Aberrant protein–protein interactions underpin numerous diseases, notably cancer [8–11].

Comparative genomic analysis has revealed that the number of protein interaction domains has undergone a remarkable expansion during the evolution from unicellular organisms to multicellular metazoans [12–14]. For instance, the Src homology (SH)3 domain, which recognizes proline-rich motifs, has grown from 27 members in yeast to approximately 300 in humans [15,16]. The number of PDZ (postsynaptic density protein 95, PSD-95; discs large, Dlg; and zonula occludens-1, ZO-1) domains has increased from three in yeast to over 250 in humans [5,17]. Similarly, the SH2 domain, the largest family of modular domains dedicated to the recognition of phosphotyrosine, has 120 members in the human genome but only one in yeast [18].

The expansion of protein interaction domains raises the question of how the diverse specificity observed for a given domain family, which defines its biological function, is engendered by a common structural scaffold. Drawing on recent work on the SH2, SH3 and PDZ domains, we propose here a unifying paradigm for the evolution of domain specificity in which variability of the surface loops and/or binding-site residues drives the rapid expansion of domain specificity.

Surface loop configuration dictates the major specificity classes of the SH2 domain

By relaying signals from tyrosine kinases to downstream proteins, SH2 domains play a pivotal role in cellular signal transduction [18–21]. All SH2 domains comprise approximately 100 amino acids and share a common fold characterized by a central seven-stranded β-sheet flanked by two α-helices (Figure 1a) [22–24].
SH2 domains, in general, bind only to peptide sequences that contain a phosphorylated tyrosine (pTyr) [25,26]. Each member, however, has a distinct preference for residues C-terminal to the pTyr. Earlier studies of SH2–peptide complexes led to the proposal of a two-prongs-engaging-two-binding-sites model to explain SH2 domain specificity [27,28]. In this model, the first prong of the peptide, the pTyr, inserts into a positively charged pocket (characterized by an invariable Arg residue at βB5, Figure 1a), whereas the second prong, the C-terminal residue of the peptide, is accommodated by a second pocket on the surface of the SH2 domain. Although this elegant model captures the essence of many SH2 domain–ligand interactions, it does not provide a full explanation for the diverse specificity displayed by the SH2 domain family [1,26,29–34].

Opinion. Corresponding author: Li, S.S.C. (sli@uwo.ca).
0968-0004/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibs.2010.12.001
Trends in Biochemical Sciences, April 2011, Vol. 36, No. 4, p. 183

The specificities of approximately two-thirds of the human SH2 domains have been mapped by screening synthetic peptide libraries [25,26,33] and peptide library arrays [35]. These studies suggest that the SH2 domains can be divided into three major specificity classes based on the most significant position-selectivity. The P+2 class (i.e. the second residue C-terminal to pTyr), represented by the growth factor receptor-bound protein 2 (GRB2) SH2 domain, recognizes an Asn+2 residue; the P+3 class, typified by the Src SH2 domain, selects a residue (frequently