ANNOTATED SEQUENCE RECORD

The complete genome sequence of Canna yellow streak virus

W. A. Monger · I. P. Adams · R. H. Glover · B. Barrett

Received: 28 February 2010 / Accepted: 6 May 2010 / Published online: 22 May 2010
© Springer-Verlag 2010

Abstract

Canna yellow streak virus (Potyvirus, Potyviridae) was sequenced using the novel method of next-generation pyrosequencing. The complete genome was found to be 9,502 nucleotides long, excluding the poly-A tail, with a predicted genome organisation typical for a member of the genus Potyvirus. As with other potyviruses that infect monocotyledons, some of the predicted cleavage sites of the polyprotein were unusual, such as a glutamic acid/threonine (E/T) site between the CI and 6K2 proteins and a glutamic acid/aspartic acid (E/D) site between the NIa and NIb proteins. Evidence of the presence of endogenous pararetroviruses in the canna genome was found from the large number of sequences obtained with this method.

Introduction

Canna (family Cannaceae) is an ornamental plant that originates from Central and South America and is transported worldwide in the form of rhizomes. This form of vegetative propagation has made viruses a major source of concern to growers and breeders. Canna yellow streak virus (CaYSV) is a newly reported virus of canna plants, first identified in 2007 [13]. This virus was found to be responsible for speckling and streaking symptoms on the large leaves of different varieties of canna plants. Partial sequences of the virus have been obtained from cannas from European countries as well as Israel and South Africa, confirming this virus to be a worldwide problem. The virus is a member of the genus Potyvirus, family Potyviridae. In this study, the full genome sequence and genome structure of CaYSV were determined through next-generation pyrosequencing.

Virus material and sequence analysis

The CaYSV-infected plant (Canna lily, variety Panama) had been maintained at FERA for the previous 2 years.
This plant was originally held in a UK collection. The virus was sequenced using the pyrosequencing method described by [1]. Briefly, total RNA was extracted from both CaYSV-containing and uninfected canna leaves. Double-stranded cDNA was produced using tagged random and oligo dT primers. PCR amplification was performed using Tag primers. The cDNA from uninfected canna was amplified using a nucleotide mix containing biotin-16-dUTP. The cDNA from infected canna was amplified with unlabelled nucleotides. A subtractive hybridisation was performed with the biotinylated uninfected canna cDNA bound to streptavidin beads. The resulting enriched infected sample was amplified again with tagged primers, and the products were blunt-ended before sequencing. Sequencing was performed using a GS-FLX Genome Sequencer (Advanced Genomics Facility, Liverpool University, UK). The 5′ end of the genome was confirmed by rapid amplification of cDNA ends (RACE), using the SMART RACE kit (Clontech, USA). Contig assembly was performed using the CLC bio Genomics Workbench assembler (CLC bio, Denmark).

W. A. Monger (✉) · I. P. Adams · R. H. Glover · B. Barrett
The Food and Environment Research Agency (FERA), Sand Hutton, York YO41 1LZ, UK
e-mail: wendy.monger@fera.gsi.gov.uk
Arch Virol (2010) 155:1515–1518
DOI 10.1007/s00705-010-0694-0

A total of 92,653 sequence reads were generated, totalling 26,173,231 nucleotides (nt), which condensed to 2,215 contigs and 28,517 unassembled sequences. Analysis using BLAST-N and BLAST-X assigned total sequences