Equally Parsimonious Pathways Through an RNA Sequence Space Are Not Equally Likely

Youn-Hyung Lee, Lisa M. D'Souza, George E. Fox

Department of Biochemical and Biophysical Sciences, 3201 Cullen Blvd., University of Houston, Houston, TX 77204-5934, USA

Received: 27 August 1996 / Accepted: 14 April 1997

Abstract. An experimental system for determining the potential ability of sequences resembling 5S ribosomal RNA (rRNA) to perform as functional 5S rRNAs in vivo in the Escherichia coli cellular environment was devised previously. Presumably, the only 5S rRNA sequences that would have been fixed by ancestral populations are ones that were functionally valid, and hence the actual historical paths taken through RNA sequence space during 5S rRNA evolution would have most likely utilized valid sequences. Herein, we examine the potential validity of all sequence intermediates along alternative equally parsimonious trajectories through RNA sequence space which connect two pairs of sequences that had previously been shown to behave as valid 5S rRNAs in E. coli. The first trajectory requires a total of four changes. The 14 sequence intermediates provide 24 apparently equally parsimonious paths by which the transition could occur. The second trajectory involves three changes, six intermediate sequences, and six potentially equally parsimonious paths. In total, only eight of the 20 sequence intermediates were found to be clearly invalid. As a consequence of the position of these invalid intermediates in the sequence space, seven of the 30 possible paths consisted of exclusively valid sequences. In several cases, the apparent validity/invalidity of the intermediate sequences could not be anticipated on the basis of current knowledge of the 5S rRNA structure. This suggests that the interdependencies in RNA sequence space may be more complex than currently appreciated.
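The counts quoted above follow from simple combinatorics: two sequences that differ at k positions are connected by k! equally parsimonious paths passing through 2^k - 2 distinct intermediates. The following is a minimal sketch of that enumeration, not the procedure used in this study. It uses toy sequences rather than the actual 5S rRNA sequences, the function names are hypothetical, and the "invalid" intermediate in the usage lines is made up purely for illustration.

```python
from itertools import permutations


def parsimonious_paths(start, end):
    """Yield each shortest path from start to end as a list of sequences.

    Every path changes one differing position per step, so each ordering
    of the differing positions gives one path.
    """
    if len(start) != len(end):
        raise ValueError("sequences must have equal length")
    positions = [i for i, (a, b) in enumerate(zip(start, end)) if a != b]
    for order in permutations(positions):
        seq = list(start)
        path = [start]
        for pos in order:
            seq[pos] = end[pos]          # apply one change
            path.append("".join(seq))
        yield path


def intermediates(start, end):
    """Return the set of distinct sequences strictly between start and end."""
    return {s for path in parsimonious_paths(start, end) for s in path[1:-1]}


def valid_paths(start, end, invalid):
    """Return the paths whose intermediates avoid every sequence in `invalid`."""
    invalid = set(invalid)
    return [p for p in parsimonious_paths(start, end)
            if not invalid.intersection(p[1:-1])]


# Toy sequences (assumption): four changes, then three changes.
print(len(list(parsimonious_paths("AAAA", "CCCC"))))  # 24 paths
print(len(intermediates("AAAA", "CCCC")))             # 14 intermediates
print(len(list(parsimonious_paths("AAA", "CCC"))))    # 6 paths
print(len(intermediates("AAA", "CCC")))               # 6 intermediates

# One hypothetical invalid intermediate removes every path that passes through it.
print(len(valid_paths("AAAA", "CCCC", {"CAAA"})))     # 18 paths remain
```

A single invalid intermediate near the start of a trajectory removes many paths at once, which is how a minority of invalid intermediates can exclude a majority of the paths.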
If ancestral sequences predicted by parsimony are to be regarded as actual historical sequences, then the present results would suggest that they should also satisfy a validity requirement and that, in at least limited cases, this conjecture can be tested experimentally.

Key words: Sequence space — Shape space — 5S rRNA — Parsimony — Vibrio proteolyticus — Evolutionary paths

Correspondence to: G.E. Fox

J Mol Evol (1997) 45:278–284 © Springer-Verlag New York Inc. 1997

Introduction

It has been suggested that the evolution of a macromolecule can be understood as an exploration of a sequence space comprising all possible primary sequences (Smith 1970; Ninio 1983; Eigen et al. 1988). Those sequences that satisfy the requirements of biological function are favorably selected. Typically, function is facilitated by proper three-dimensional folding, and, hence, the allowable sequences of a particular macromolecule will comprise a “structure space” or “shape space.” Thus, the evolutionary history of a macromolecule can be understood as an exploration of a local region of a shape space composed of a subset of all possible primary sequences that satisfy structural constraints imposed by functional requirements. Some progress in predicting membership in a shape space has been made by consideration of folding constraints (Schuster et al. 1994). The ability to explicitly predict or determine the sequences comprising such a shape space would provide considerable insight into the evolutionary potential of the macromolecule under consideration. For instance, by comparing sequences actually found in nature to those that comprise the shape