Identification of New Repeating Motifs in Titin

Marion Greaser*

University of Wisconsin, Muscle Biology Laboratory, Madison, Wisconsin

ABSTRACT Repeating motifs of 26–28 amino acids have been identified in the PEVK region of the giant elastic protein titin. These motifs, termed PPAK for the four amino acids that often constitute the beginning of the motif, occur 60 times in human soleus titin. PPAK motifs occur in groups of 2–12 that are separated by regions rich in glutamic acid (approximately 45%) and termed polyE segments. The fluctuation of the net charge between the PPAK and polyE regions suggests ionic interactions between these segments and their involvement in the elastic function of titin. Proteins 2001;43:145–149. © 2001 Wiley-Liss, Inc.

Key words: protein domains; elasticity; PEVK; ionic interactions; secondary structure; muscle

INTRODUCTION

Titin, also called connectin, is a >3-million dalton protein found in heart and skeletal muscle. 1,2 The protein is involved in a number of functions: connecting the thick filaments to the Z-lines, 3 serving as a template for assembly of the thick filaments, 4 preventing the sarcomere from being overstretched, 5,6 acting as a serine/threonine kinase, 7,8 and playing a role in sarcomere assembly. 9–13 It is believed to be the major component responsible for passive tension. 14,15 Additional information on this unusual protein can be found in several recent reviews. 16–19

Since the earliest titin cDNA sequence became available, it was clear that a number of repeating motifs were involved in its amino acid sequence. 20 Two different types of 100-amino acid motifs were found with similarities to immunoglobulin and the fibronectin III domains. Part of the A-band region of titin also consisted of an 11-motif super-repeat structure. 7 Additional super-repeats have been found in the tandem Ig domains in the I-band section.
21 A 45-amino acid repeat has been identified near the Z-line end with varying numbers (≥7) expressed in different muscles. 22 Shorter serine–proline repeats considered as potential phosphorylation sites have also been found in regions of titin near the M-line 23 and Z-line. 24

A unique feature of the titin sequence is the PEVK segment in the I-band region. 25 The PEVK is so named because approximately 75% of the amino acid residues are proline (P), glutamic acid (E), valine (V), and lysine (K). The length of the PEVK segment varies between 163 and 2174 residues, with the N2B isoform of the heart having the shorter length and the soleus muscle having the much longer segment. 25 Studies using antibodies to label muscle at various degrees of extension have demonstrated that PEVK lengthens with stretch. 26,27 Thus, the PEVK region is believed to serve an elastic function in muscle. Although several reports have mentioned sequence repetitions in the PEVK, the nature of such repeats has not been described. The current paper describes two new types of titin repeating sequence that constitute the bulk of the PEVK.

MATERIALS AND METHODS

A variety of gene and protein analysis software was used in the current study. These include BLAST, 28,29 MEME, 30 MotifSearch, 31 PeptideStructure, 32 PeptideSort, and Pileup. Most of these methods were used through SeqWeb Version 1.1 of the Genetics Computer Group Wisconsin Package Version 10.

RESULTS AND DISCUSSION

A search of the PEVK sequence for repeating structure was prompted by the observation that a titin monoclonal antibody (9D10) 33 labeled two 0.55-μm-wide zones per sarcomere that corresponded to the PEVK region of human soleus titin.
34 Since (1) there are only two sets of titin molecules per sarcomere, 3,35,36 (2) all the molecules are aligned in parallel in each set, and (3) most monoclonals label a zone no wider than 10 nm, the broad staining zone implied that multiple sequence regions were being recognized by the antibody.

Numerous examples of the amino acid sequence PPAK were first found in the human soleus titin sequence (GenBank accession number X90569.1 25) by visual inspection, and these occurred at 26–28 amino acid intervals. 37 A training set consisting of the best 23 sequences was used with the MEME program and a motif width of 28. The MEME output was then used in MotifSearch with the database. A total of 320 matches were recognized in the human soleus titin with this program, but there were numerous instances of overlap, presumably because of the amino acid redundancy and short within-motif sequence repeats. Motif borders were selected and overlap sequences eliminated with the help of the position P-values. 30 Motif positions 3–6 (typically KVP) were also useful in constructing a final alignment since this area of the motif showed the most limited variability.

Grant sponsor: College of Agricultural and Life Sciences, University of Wisconsin–Madison; Grant sponsor: National Institutes of Health; Grant number: HL62466.
*Correspondence to: Marion Greaser, University of Wisconsin, Muscle Biology Laboratory, Madison, Wisconsin. E-mail: mgreaser@facstaff.wisc.edu
Received 17 July 2000; Accepted 8 December 2000
PROTEINS: Structure, Function, and Genetics 43:145–149 (2001)
© 2001 WILEY-LISS, INC.
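
The initial observation described in the Results (occurrences of PPAK spaced 26–28 residues apart, in a segment composed largely of P, E, V, and K) can be illustrated with a minimal sketch. It is not the MEME or MotifSearch analysis used in the study. The function names are hypothetical, and the toy sequence is synthetic filler built only to give spacings of 28 and 26 residues; it is not titin sequence.

```python
from typing import List


def find_motif_starts(sequence: str, motif: str = "PPAK") -> List[int]:
    """Return the 0-based start position of every occurrence of `motif`."""
    starts = []
    pos = sequence.find(motif)
    while pos != -1:
        starts.append(pos)
        # Resume one residue further on so overlapping hits are not missed.
        pos = sequence.find(motif, pos + 1)
    return starts


def spacings(starts: List[int]) -> List[int]:
    """Return the distance in residues between consecutive motif starts."""
    return [b - a for a, b in zip(starts, starts[1:])]


def residue_fraction(sequence: str, residues: str = "PEVK") -> float:
    """Return the fraction of the sequence made up of the given residues."""
    if not sequence:
        return 0.0
    return sum(sequence.count(r) for r in residues) / len(sequence)


if __name__ == "__main__":
    # Synthetic toy sequence (an assumption for illustration, not real data):
    # three PPAK starts separated by 28 and 26 residues.
    toy = "PPAK" + "E" * 24 + "PPAK" + "V" * 22 + "PPAK" + "K" * 23

    starts = find_motif_starts(toy)
    print(starts)                            # [0, 28, 54]
    print(spacings(starts))                  # [28, 26]
    print(round(residue_fraction(toy), 3))   # 0.963
```

Applied to a real PEVK sequence, a scan of this kind would only flag candidate motif starts; defining motif borders and resolving overlapping matches requires a profile-based approach such as the one described above.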