ORIGINAL PAPER

Jetty S. S. Ammiraju · Yeisoo Yu · Meizhong Luo · Dave Kudrna · HyeRan Kim · Jose L. Goicoechea · Yuichi Katayose · Takashi Matsumoto · Jianzhong Wu · Takuji Sasaki · Rod A. Wing

Random sheared fosmid library as a new genomic tool to accelerate complete finishing of rice (Oryza sativa spp. Nipponbare) genome sequence: sequencing of gap-specific fosmid clones uncovers new euchromatic portions of the genome

Received: 6 May 2005 / Accepted: 15 August 2005 / Published online: 1 October 2005
© Springer-Verlag 2005

Abstract The International Rice Genome Sequencing Project has recently announced a high-quality finished sequence that covers nearly 95% of the japonica rice genome, representing 370 Mbp. Nevertheless, the current physical map of japonica rice contains 62 physical gaps, corresponding to approximately 5% of the genome, that have not been identified or represented in the comprehensive array of publicly available BAC, PAC and other genomic library resources. Without finishing these gaps, it is impossible to identify the complete complement of genes encoded by the rice genome, and some 5% of the genome and its functions will remain unknown. In this article, we report the construction and characterization of a tenfold redundant, 40 kbp insert fosmid library generated by random mechanical shearing. We demonstrated its utility in refining the physical map of rice by identifying and in silico mapping 22 gap-specific fosmid clones, with particular emphasis on chromosomes 1, 2, 6, 7, 8, 9 and 10. Further sequencing of 12 of the gap-specific fosmid clones uncovered unique rice genome sequence that was not previously reported in the finished IRGSP sequence, which emphasizes the need to complete the finishing of the rice genome.

Introduction

The International Rice Genome Sequencing Project (IRGSP; http://rgp.dna.affrc.go.jp/IRGSP), a consortium of ten countries, has recently announced a finished genome sequence covering 95% of the rice genome.
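As a rough check on what "tenfold redundant" means for a 40 kbp insert library, the arithmetic can be sketched as below. This is an illustrative calculation, not the authors' method: the genome size is an assumption inferred from the figures in the text (370 Mbp finished, said to be about 95% of the genome), the function names are hypothetical, and the actual number of clones in the library is not stated in this section. The coverage probability uses the standard Clarke-Carbon estimate, which assumes clones are drawn uniformly at random from the genome, an idealization that random mechanical shearing is intended to approximate.

```python
import math


def clones_for_redundancy(genome_bp, insert_bp, redundancy):
    """Clones needed so that total insert length = redundancy x genome size."""
    return math.ceil(redundancy * genome_bp / insert_bp)


def prob_locus_covered(genome_bp, insert_bp, n_clones):
    """Clarke-Carbon estimate: chance a given locus lies in at least one clone.

    Assumes clones sample the genome uniformly at random (no cloning bias).
    """
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


# Assumption: genome size inferred from the text (370 Mbp is ~95% of the genome).
GENOME_BP = round(370e6 / 0.95)
INSERT_BP = 40_000   # average insert size stated in the abstract
REDUNDANCY = 10      # "tenfold redundant"

n_clones = clones_for_redundancy(GENOME_BP, INSERT_BP, REDUNDANCY)
p_covered = prob_locus_covered(GENOME_BP, INSERT_BP, n_clones)

print(f"assumed genome size: {GENOME_BP / 1e6:.1f} Mbp")
print(f"clones for {REDUNDANCY}x coverage: {n_clones:,}")
print(f"P(given locus is in the library): {p_covered:.5f}")
```

Under these assumptions, a tenfold library needs on the order of 10^5 clones, and any single locus is very likely to be represented. In practice, representation is lower wherever sequence is hard to clone, which is why gaps can persist despite deep BAC and PAC resources.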
The IRGSP followed a gold-standard approach to rice genome sequencing. First, an exhaustive, sequence-ready, integrated genetic and physical map is established, composed of BAC fingerprints assembled into contigs using the FPC software (Soderlund et al. 2000; Chen et al. 2002) and sequence-tagged connectors (Venter et al. 1996; Mao et al. 2000). Second, genetically anchored seed BACs or PACs are shotgun sequenced and finished. Minimally overlapping clones flanking the seed BACs/PACs are then selected based on the sequence-ready physical map and sequenced in the same way. This clone-by-clone approach is repeated until the complete genome is sequenced. Each base pair was sequenced ten times on average, ensuring an error rate of less than one base in 10,000. The finished sequence now comprises 370 Mbp of contig sequence.

Nevertheless, the present IRGSP integrated genetic and physical map of japonica rice still contains about 62 physical gaps that, depending on the chromosome, range in size from 10 to 100 kbp in euchromatic regions and up to 1 Mbp in centromeric regions; together they correspond to the nearly 5% of the genome missing from the reported rice genome sequence. These include 9 centromeric, 17 telomeric and 36 arm-specific gaps that are distributed on several chromosomes, with an estimated total size of approximately 18.1 Mbp (IRGSP, 2005). Although the rice genome is now considered “complete” and provides

Communicated by Q. Zhang

The fosmid library reported here is publicly available from our web site http://www.genome.arizona.edu/orders

J. S. S. Ammiraju · Y. Yu · M. Luo · D. Kudrna · H. Kim · J. L. Goicoechea · R. A. Wing (✉)
Department of Plant Sciences and BIO5 Institute, Arizona Genomics Institute, The University of Arizona, Tucson, AZ 85721, USA
E-mail: rwing@ag.arizona.edu
Tel.: +1-520-6269595
Fax: +1-520-6211259

Y. Katayose · T. Matsumoto · J. Wu · T. Sasaki
National Institute of Agrobiological Sciences, Tsukuba, Ibaraki, Japan

Theor Appl Genet (2005) 111: 1596–1607
DOI 10.1007/s00122-005-0091-3