A Draft Sequence for the Genome of the Domesticated Silkworm (Bombyx mori)

Biology analysis group: Qingyou Xia,1*† Zeyang Zhou,1* Cheng Lu,1* Daojun Cheng,1 Fangyin Dai,1 Bin Li,1 Ping Zhao,1 Xingfu Zha,1 Tingcai Cheng,1 Chunli Chai,1 Guoqing Pan,1 Jinshan Xu,1 Chun Liu,1 Ying Lin,1 Jifeng Qian,1 Yong Hou,1 Zhengli Wu,1 Guanrong Li,1 Minhui Pan,1 Chunfeng Li,1 Yihong Shen,1 Xiqian Lan,1 Lianwei Yuan,1 Tian Li,1 Hanfu Xu,1 Guangwei Yang,1 Yongji Wan,1 Yong Zhu,1 Maode Yu,1 Weide Shen,1 Dayang Wu,1 Zhonghuai Xiang1†
Genome analysis group: Jun Yu,2,3*† Jun Wang,2,3* Ruiqiang Li,2* Jianping Shi,2 Heng Li,2 Guangyuan Li,2 Jianning Su,2 Xiaoling Wang,2 Guoqing Li,2 Zengjin Zhang,2 Qingfa Wu,2 Jun Li,2 Qingpeng Zhang,2 Ning Wei,2 Jianzhe Xu,2 Haibo Sun,2 Le Dong,2 Dongyuan Liu,2 Shengli Zhao,2 Xiaolan Zhao,2 Qingshun Meng,2 Fengdi Lan,2 Xiangang Huang,2 Yuanzhe Li,2 Lin Fang,2 Changfeng Li,2 Dawei Li,2 Yongqiao Sun,2 Zhenpeng Zhang,2 Zheng Yang,2 Yanqing Huang,2 Yan Xi,2 Qiuhui Qi,2 Dandan He,2 Haiyan Huang,2 Xiaowei Zhang,2 Zhiqiang Wang,2 Wenjie Li,2 Yuzhu Cao,2 Yingpu Yu,3 Hong Yu,3 Jinhong Li,3 Jiehua Ye,3 Huan Chen,3 Yan Zhou,3 Bin Liu,2 Jing Wang,2 Jia Ye,3 Hai Ji,2 Shengting Li,2 Peixiang Ni,2 Jianguo Zhang,2 Yong Zhang,2 Hongkun Zheng,2 Bingyu Mao,2 Wen Wang,2 Chen Ye,2 Songgang Li,2 Jian Wang,2,3 Gane Ka-Shu Wong,2,3,4† Huanming Yang2,3†

We report a draft sequence for the genome of the domesticated silkworm (Bombyx mori), covering 90.9% of all known silkworm genes. Our estimated gene count is 18,510, which exceeds the 13,379 genes reported for Drosophila melanogaster. Comparative analyses to fruitfly, mosquito, spider, and butterfly reveal both similarities and differences in gene content.

Silk fibers are derived from the cocoon of the silkworm Bombyx mori, which was domesticated over the past 5000 years from the wild progenitor Bombyx mandarina (1). Silkworms are second only to fruitfly as a model for insect genetics, owing to their ease of rearing, the availability of mutants from genetically homogeneous inbred lines, and the existence of a large body of information on their biology (2). There are about 400 visible phenotypes, and ~200 of these are assigned to linkage groups (3). Silkworms can also be used as a bioreactor for proteinaceous drugs and as a source of biomaterials.

Here, we present a draft sequence of the silkworm genome with 5.9× coverage.

B. mori has 28 chromosomes.
More than 1000 genetic markers have been mapped at an average spacing of 2 cM (~500 kb) (4). A physical map is being constructed through the fingerprinting and end sequencing of bacterial artificial chromosome (BAC) clones (5). Many expressed sequence tags (ESTs) have been produced (6), and a 3× draft sequence has just been announced by the International Lepidopteran Genome Project (7). Our project is independent of, but complementary to, that of the consortium.

Our sequence has been submitted to the DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank (project accession number AADK00000000, version AADK01000000) and is also accessible from our Web site (http://silkworm.genomics.org.cn) (8). ESTs discussed in this Report can be found at GenBank (accession numbers CK484630 to CK565104).

DNA for genome sequencing is derived from an inbred domesticated variety, Dazao (posterior silk gland, fifth-instar day 3, on a mix of 1225 males). A whole-genome shotgun (9) technique was used, and our coverage is 5.9×. Including the unassembled reads, the total estimated genome size is 428.7 Mb, or 3.6 and 1.54 times larger than that of fruitfly (10) and mosquito (11), respectively. The N50 contig and scaffold sizes are 12.5 kb and 26.9 kb, respectively. Our assembly contains 90.9% of the 212 known silkworm genes (with full-length cDNA sequence), 90.9% of ~16,425 EST clusters, and 82.7% of the 554 known genes from other Lepidoptera. Additional details of our quality analyses are given in the supporting online material (fig. S1 and tables S1 to S6).

We developed a gene-finder algorithm BGF (BGI GeneFinder) (fig. S2), based on GenScan and FgeneSH. To determine a gene count for silkworm, one must correct for erroneous and partial predictions (Table 1). The final corrected gene count for silkworm is 18,510 genes, which far exceeds the official gene count of 13,379 for fruitfly

1 Southwest Agricultural University, Chongqing Beibei, 400716, China.
2 Beijing Institute of Genomics of Chinese Academy of Sciences, Beijing Genomics Institute, Beijing Proteomics Institute, Beijing 101300, China.
3 James D. Watson Institute of Genome Sciences of Zhejiang University, Hangzhou Genomics Institute, Key Laboratory of Genomic Bioinformatics of Zhejiang Province, Hangzhou 310008, China.
4 University of Washington Genome Center, Department of Medicine, University of Washington, Seattle, WA 98195, USA.

*These authors contributed equally to this work.
†To whom correspondence should be addressed. E-mail: xiaqy@swau.cq.cn (Q.X.), xzh@swau.cq.cn (Z.X.), junyu@genomics.org.cn (J.Y.), gksw@genomics.org.cn (G.K-S.W.), yanghm@genomics.org.cn (H.Y.)

REPORTS. www.sciencemag.org SCIENCE VOL 306 10 DECEMBER 2004 1937
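
The N50 contig and scaffold sizes quoted for the assembly summarize its contiguity: N50 is the length L such that sequences of length L or longer together hold at least half of the assembled bases. The following is a minimal sketch of that standard definition, not the authors' assembly pipeline; the function name `n50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total bases.
    """
    # Walk from the longest sequence downward, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths in kb (illustrative only, not from the paper).
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # 10 + 6 = 16 >= 15 (half of 30), so N50 is 6
```

Applied separately to the contig lengths and to the scaffold lengths of an assembly, the same calculation would yield the two figures of the kind reported in the text.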