NATURE BIOTECHNOLOGY VOLUME 30 NUMBER 1 JANUARY 2012 105
RESOURCE
Asian cultivated rice (Oryza sativa) is thought to have been domesticated from divergent
populations of Asian wild rice, O. rufipogon and O. nivara, >10,000 years ago [1,2]. During
domestication, rice has undergone significant phenotypic changes in grain size, color,
shattering, seed dormancy and tillering. For decades, geneticists have used quantitative
trait locus mapping to localize the major causative genes responsible for these traits,
yielding a dozen trait-related genes in cultivated rice (for example, sh4, rc and prog1) [3–6].
Additionally, a recent genome-wide association study using genome-wide SNP data for 517
Chinese landraces identified loci that may be associated with 14 agronomic traits [7].
However, quantitative trait locus and gene mapping is labor intensive and time consuming,
taking years to construct segregating populations and requiring intensive phenotyping and
genotyping. Association mapping is also prone to missing excellent alleles because the
favorable alleles tend to be rare and are difficult to detect during regular association
analyses [8]. A more recent report tried to identify artificially selected genes [9], but the
strategy of pooling many accessions (strains identified by International Rice Research
Institute (IRRI) accession numbers) and using shallow sequencing coverage provided limited
variation data for rice. If a comprehensive catalog of genome variation in both cultivated
and wild rice were available, it would greatly facilitate the identification of functional
variations in elite varieties by comparing genomic variation in an elite variety with data
from controls. Dense variation data will also be useful for marker-assisted breeding and
gene mapping of rice.
RESULTS
Sequencing and mapping
Cultivated rice is classified into two major subspecies of O. sativa (indica and japonica)
and is further subdivided into genetically differentiated groups, including Glaszmann’s six
groups (I to VI) [10] and Garris et al.’s five groups (indica, aus, aromatic, temperate
japonica and tropical japonica) [11]. We selected 40 cultivated rice accessions to represent
all of the major groups of Asian cultivated rice (Supplementary Table 1), including 11
tropical japonica (TRJ), 8 temperate japonica (TEJ) and 6 aromatic (ARO) that belong to
japonica rice, and 4 aus (AUS) and 9 indica (IND) that belong to indica rice (Supplementary
Table 1). In addition, we sampled one accession each from groups III and IV, proposed by
Glaszmann [10], which were not included in a previous population study [11]. Among these
cultivars, 29 are considered to be landraces and 11 are improved varieties. To strictly
control the quality of our sequencing and SNP calling, we also included the Nipponbare
strain, which was used to generate the reference rice genome sequence [12]. For wild rice samples,
Resequencing 50 accessions of cultivated and wild
rice yields markers for identifying agronomically
important genes
Xun Xu (1–3,12), Xin Liu (2,12), Song Ge (4,12), Jeffrey D Jensen (5,12), Fengyi Hu (6,12),
Xin Li (1,12), Yang Dong (1,12), Ryan N Gutenkunst (7), Lin Fang (2), Lei Huang (3,4),
Jingxiang Li (2), Weiming He (2,8), Guojie Zhang (1,2,4), Xiaoming Zheng (3,4),
Fumin Zhang (3), Yingrui Li (2), Chang Yu (2), Karsten Kristiansen (2,9), Xiuqing Zhang (2),
Jian Wang (2), Mark Wright (10), Susan McCouch (10), Rasmus Nielsen (1,9,11),
Jun Wang (2,9) & Wen Wang (1)
Rice is a staple crop that has undergone substantial phenotypic and physiological changes during domestication. Here we
resequenced the genomes of 40 cultivated accessions selected from the major groups of rice and 10 accessions of their wild
progenitors (Oryza rufipogon and Oryza nivara) to >15× raw data coverage. We investigated genome-wide variation patterns in
rice and obtained 6.5 million high-quality single nucleotide polymorphisms (SNPs) after excluding sites with missing data in
any accession. Using these population SNP data, we identified thousands of genes with significantly lower diversity in cultivated
but not wild rice, which represent candidate regions selected during domestication. Some of these variants are associated with
important biological features, whereas others have yet to be functionally characterized. The molecular markers we have identified
should be valuable for breeding and for identifying agronomically important genes in rice.
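
The two filtering ideas in the summary above (discarding sites with missing data in any accession, then looking for lower diversity in cultivated than in wild rice) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the diversity measure here is assumed to be the mean pairwise difference per site, genotypes are treated as single allele calls, and the function names, the `MISSING` marker and the toy genotype matrix are all hypothetical and invented for illustration.

```python
from itertools import combinations

MISSING = None  # hypothetical marker for an uncalled genotype


def drop_sites_with_missing(sites):
    """Keep only sites at which every accession has a called allele."""
    return [site for site in sites if all(allele is not MISSING for allele in site)]


def pi_at_site(alleles):
    """Fraction of accession pairs that carry different alleles at one site."""
    pairs = list(combinations(alleles, 2))
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def mean_pi(sites, columns):
    """Average per-site pairwise diversity over the chosen accession columns."""
    if not sites:
        return 0.0
    return sum(pi_at_site([site[i] for i in columns]) for site in sites) / len(sites)


# Toy data, invented for illustration: rows are SNP sites, columns are accessions.
toy_sites = [
    ["A", "A", "A", "A", "A", "G", "A"],
    ["C", "C", "C", "C", "C", "T", "T"],
    ["G", MISSING, "G", "G", "G", "G", "T"],  # dropped: one accession is uncalled
    ["T", "T", "T", "C", "T", "C", "C"],
]
cultivated = [0, 1, 2, 3]  # assumed column indices of cultivated accessions
wild = [4, 5, 6]           # assumed column indices of wild accessions

kept = drop_sites_with_missing(toy_sites)
pi_cultivated = mean_pi(kept, cultivated)
pi_wild = mean_pi(kept, wild)
ratio = pi_cultivated / pi_wild  # values well below 1 suggest reduced diversity
print(len(kept), round(pi_cultivated, 3), round(pi_wild, 3), round(ratio, 3))
```

On this toy input, three of the four sites survive the missing-data filter and the cultivated-to-wild diversity ratio is about 0.25. In a real analysis the same comparison would typically be made per gene or per genomic window, with a significance test rather than a raw ratio.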
(1) CAS-Max Planck Junior Research Group on Evolutionary Genomics, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), Kunming, China.
(2) BGI-Shenzhen, Shenzhen, China.
(3) Graduate University of Chinese Academy of Sciences, Beijing, China.
(4) State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China.
(5) School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
(6) Food Crops Research Institute, Yunnan Academy of Agricultural Sciences, Kunming, China.
(7) Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA.
(8) South China University of Technology, Guangdong, China.
(9) Department of Biology, University of Copenhagen, Copenhagen, Denmark.
(10) Department of Plant Breeding & Genetics, Cornell University, Ithaca, New York, USA.
(11) Departments of Integrative Biology and Statistics, University of California, Berkeley, USA.
(12) These authors contributed equally to this work.
Correspondence should be addressed to W.W. (wwang@mail.kiz.ac.cn) or J.W. (wangj@genomics.org.cn) or R.N. (rasmus_nielsen@berkeley.edu).
Received 3 June; accepted 25 October; published online 11 December 2011; doi:10.1038/nbt.2050
© 2012 Nature America, Inc. All rights reserved.