letters nature structural biology • volume 7 number 5 • may 2000 375

Structural analysis of WW domains and design of a WW prototype

Maria J. Macias¹,², Virginie Gervais¹⁻³, Concepcion Civera² and Hartmut Oschkinat¹

¹Forschungsinstitut für Molekulare Pharmakologie, Alfred-Kowalke-Str. 4, 10315 Berlin, Germany. ²European Molecular Biology Laboratory, Meyerhofstr. 1, 69117 Heidelberg, Germany. ³Present address: École supérieure de Biotechnologie-CNRS-UPR 9003, Boulevard Sébastien Brant, Pole API, 67400 Strasbourg-Illkirch, France.

Two new NMR structures of WW domains, from the mouse formin binding protein and from a putative 84.5 kDa protein of Saccharomyces cerevisiae, show that this domain, only 35 amino acids in length, defines the smallest monomeric triple-stranded antiparallel β-sheet protein domain that is stable in the absence of disulfide bonds, tightly bound ions or ligands. The structural roles of conserved residues have been studied using site-directed mutagenesis of both wild-type domains. Crucial interactions responsible for the stability of the WW structure have been identified. Based on a network of highly conserved long-range interactions across the β-sheet structure that supports the WW fold and on a systematic analysis of conserved residues in the WW family, we have designed a folded prototype WW sequence.

Noncatalytic domains of signaling proteins¹⁻³ are useful for investigating sequence–structure–function relationships. The WW domain is the simplest natural β-sheet structure. It is a 35-residue protein module found in signaling and regulatory proteins, with two highly conserved tryptophans and a strictly conserved proline⁴⁻⁶. The NMR structure of the human Yap65WW domain in complex with the proline-rich peptide GTPPPPYTVG⁷ revealed a slightly curved, three-stranded antiparallel β-sheet with the peptide bound onto the plane of the sheet.
However, in this study a construct of 57 amino acids was necessary to obtain a folded structure, raising questions concerning the predicted domain length. These concerns were supported by the crystal structure of the human mitotic rotamase PIN1, which contains a WW domain at its N-terminus and a prolyl-isomerase catalytic domain at the C-terminus, the WW domain supposedly being stabilized by several hydrogen bonds to the catalytic domain⁸.

To clarify domain boundaries and structure determinants, all known WW sequences were classified and examples of two new types were selected for structure determination using NMR. The sequences were divided into three groups based on the presence or absence of semi-conserved residues (Fig. 1a): (i) sequences containing the C-terminal tryptophan and N-terminal proline; (ii) sequences lacking proline at the N-terminus; and (iii) sequences lacking the C-terminal tryptophan. The Yap65WW and PIN1WW domains belong to the first class of sequences. Thus, the structures of two new WW domains have been determined, the mouse formin binding protein (FBP28WW)⁹ and the highly divergent YJQ8WW domain from a hypothetical 84.5 kDa protein of Saccharomyces cerevisiae. They represent sequences of the second and third classes, respectively. The structures of these two shorter and phylogenetically distant WW domains were compared to those of Yap65WW and PIN1WW.

The FBP28WW and YJQ8WW structures

The FBP28WW and the YJQ8WW constructs consist of 37 and 34 amino acids, respectively (Fig. 1a). The pattern of secondary structure NOEs observed for FBP28WW and YJQ8WW is similar to that of Yap65WW (Fig. 2). Two pairs of hydrogen bonds connect each pair of strands (Fig. 2), as monitored by H₂O/²H₂O exchange followed by NMR. Ensembles of the 10 best structures for both FBP28WW and YJQ8WW domains are shown in Fig. 3a,b. Experimental restraints and structural statistics are summarized in Table 1.
The number of sequential and long-range restraints per residue differs markedly between the two structures: 15 for FBP28WW and only 8 for YJQ8WW. In the spectra of the latter, broad signals are

Fig. 1 Analysis of WW sequences. a, Sequence alignment of selected WW sequences. The alignment is divided into three groups as described in the text. The representative WW sequences corresponding to Yap65WW (black), FBP28WW (green) and YJQ8WW (red) are shown below each subfamily. The prototype sequence is shown in blue with conserved positions underlined. b, Distance analysis of the sequences of Yap65WW, PIN1WW, FBP28WW, YJQ8WW and the prototype calculated with ClustalW. c, Bar representation of the distribution of amino acids per position in the WW domain family. A matrix was generated with an in-house program that counts the number of times a certain residue appears at any particular position of the alignment. The minimal number of appearances required for a residue to be considered representative of the position was eight. Bars marked with an asterisk represent those positions considered in the design of the prototype sequence.

© 2000 Nature America Inc. • http://structbio.nature.com
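The per-position counting described for Fig. 1c can be illustrated with a short sketch. The in-house program itself is not described in the text, so everything below is an assumption made for illustration: the input is taken to be a list of equal-length aligned strings with "-" marking gaps, the function names `count_matrix` and `representative_residues` are hypothetical, and the toy alignment is invented rather than drawn from real WW sequences. Only the threshold of eight appearances comes from the caption.

```python
from collections import Counter

MIN_COUNT = 8  # threshold stated in the Fig. 1c caption


def count_matrix(alignment, gap="-"):
    """Return one Counter per alignment column, ignoring gap characters."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        Counter(seq[i] for seq in alignment if seq[i] != gap)
        for i in range(width)
    ]


def representative_residues(alignment, min_count=MIN_COUNT):
    """For each column, list residues seen at least min_count times."""
    return [
        [residue for residue, n in column.most_common() if n >= min_count]
        for column in count_matrix(alignment)
    ]


# Invented toy alignment (not real WW sequences): ten rows, three columns.
toy = ["WTE"] * 8 + ["WSK", "F-K"]
print(representative_residues(toy))
```

Columns in which no residue reaches the threshold return an empty list, which would correspond to positions left unconstrained by this criterion alone.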