Plant Molecular Biology 36: 767–774, 1998.
© 1998 Kluwer Academic Publishers. Printed in Belgium.

Molecular evolution of cdc2 pseudogenes in spruce (Picea)

Anders Kvarnheden 1, Victor A. Albert 2 and Peter Engström

Department of Physiological Botany, Uppsala University, Villavägen 6, 752 36 Uppsala, Sweden; 1 Current address: Hort Research, Mt Albert Research Centre, Private Bag 92 169, Auckland, New Zealand (author for correspondence); 2 Current address: The New York Botanical Garden, 200th Street and Southern Boulevard, Bronx, NY 10458-5126, USA

Received 1 April 1997; accepted in revised form 13 November 1997

Key words: genome size, molecular evolution, nucleotide substitution

Abstract

The p34cdc2 protein and other cyclin-dependent protein kinases (CDKs) are important regulators of eukaryotic cell cycle progression. We have previously cloned a functional cdc2 gene from Picea abies and found it to be part of a family of related sequences, largely consisting of pseudogenes. We now report on the isolation of partial cdc2 pseudogenes from Picea engelmannii and Picea sitchensis, as well as partial functional cdc2 sequences from P. engelmannii, P. sitchensis and Pinus contorta. A high level of conservation between species was detected for these sequences. Phylogenetic analyses of pseudogene and functional cdc2 sequences, as well as the presence of shared insertions or deletions, support the division of most of the cdc2 pseudogenes into two subfamilies. New cdc2 pseudogenes appear to have been formed in Picea at a much higher rate than they have been obliterated by neutral mutations. The pattern of nucleotide changes in the cdc2 pseudogenes, as compared to a presumed ancestral functional cdc2 gene, was similar to that previously found in mammalian pseudogenes, with a strong bias for the transitions C to T and G to A, and the transversions C to A and G to T.
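As an illustration of how a directional substitution pattern of this kind can be tallied, the following minimal sketch compares an aligned ancestral sequence with a pseudogene sequence and counts each ancestral-to-derived base change. The function names and the toy sequences are hypothetical and invented for this example; they are not the procedure or the data used in this study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(anc, der):
    """Return 'transition' or 'transversion' for a single base change."""
    same_class = ({anc, der} <= PURINES) or ({anc, der} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def tally_substitutions(ancestral, pseudogene):
    """Count directional changes (ancestral -> pseudogene) in two aligned sequences.

    Positions with a gap ('-') or a non-ACGT symbol in either sequence are skipped,
    so insertions and deletions are not counted as substitutions.
    """
    if len(ancestral) != len(pseudogene):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for a, d in zip(ancestral.upper(), pseudogene.upper()):
        if a == d or a not in "ACGT" or d not in "ACGT":
            continue
        counts[(a, d)] += 1
    return counts


# Toy aligned sequences, invented for illustration only (not real cdc2 data).
ancestral = "ATGGCCGATC-GA"
pseudogene = "ATAGTCGATCTTA"

changes = tally_substitutions(ancestral, pseudogene)
by_class = Counter()
for (a, d), n in changes.items():
    by_class[classify(a, d)] += n

for (a, d), n in sorted(changes.items()):
    print(f"{a} -> {d}: {n} ({classify(a, d)})")
```

Keeping the direction of each change (for example C to T separately from T to C) is what allows a bias such as the one described above to be seen; collapsing the counts into unordered pairs would hide it.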
The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession numbers X80842 (cdc2Pe), X80846 (cdc2Ps), X80845 (cdc2Pnc), X80843 (cdc2Pe 1), X80844 (cdc2Pe 2), X80847 (cdc2Ps 1), X80848 (cdc2Ps 2), X80849 (cdc2Ps 3), X80850 (cdc2Ps 4) and X80851 (cdc2Ps 5).

Introduction

Pseudogenes, formed by chromosomal duplications or transpositions, are important features of multigene families of large eukaryotic genomes. Processed pseudogenes, which derive from reverse transcription of mRNA templates, are both common and dominant in mammalian gene families [cf. 36]. One extreme example is the Muridae glyceraldehyde-3-phosphate dehydrogenase sequence group, in which there is only one functional copy but more than 300 processed pseudogenes [9]. In plants, only a few reports of processed pseudogenes exist [6, 18]. In the conifer Picea abies (L.) Karst. (Norway spruce), at least half of the gene family encoding cdc2 cyclin-dependent kinases consists of pseudogenes with missense or indel mutations [21]. The cdc2 pseudogenes from spruce bear one distinctive mark of RNA processing in that they lack the introns present in the functional cdc2 genes of spruce [21] and Arabidopsis thaliana [15]. This suggests that they are processed pseudogenes [21], although the absence of introns may not be unique to this type of pseudogene [3], and the gene fragments isolated were too short to reveal the possible presence of two other marks of retrosequences, i.e. stretches of poly(A) at the 3′ end and flanking short direct repeats.

Cyclin-dependent kinases are critical components in the control of cell cycle progression in eukaryotes [reviewed in 29]. In plants, one to two functional cdc2 genes as well as more distantly related cdc2-like genes have been isolated from a range of species [reviewed in 16], including Norway spruce [21].
When tested, the plant cdc2 homologues have been able to complement cdc2/CDC28 mutations in yeast. This is in contrast to