of the bottom of the well of genetic variation
in humans. In human genetics, it is generally
assumed that when the same variant is found
in more than one individual, it arose once in
an ancestor shared by those individuals, rather
than through independent mutations of the
same site. However, at a particular class of
site, called CpG dinucleotides, the researchers
make a convincing case that variants observed
in multiple individuals often reflect mutational
recurrence.
In support of their assertion, the researchers
find that discovery rates for new CpG dinucleo-
tide mutations decrease in samples larger than
20,000 individuals. This provides further
evidence that the size of the ExAC cohort is suf-
ficiently large that we are beginning to saturate
this class of human genetic variation, at least
within the exome. It is worth noting, however,
that CpG dinucleotides have a highly elevated
mutation rate in human genomes, making the
number of samples needed to observe such sat-
uration much lower than for other kinds of vari-
ants. Nonetheless, this exciting finding presages
what lies ahead, as larger aggregate analyses of
exomes and genomes are performed.
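The link between mutation rate and saturation can be illustrated with a toy calculation. The sketch below is not the consortium's analysis: it assumes, purely for illustration, that every site in a class is carried by any one sampled individual with the same small probability, independently of other individuals. The function names and all numerical values are hypothetical.

```python
def expected_variant_sites(n_sites, p_carrier, n_individuals):
    """Expected number of sites seen as variant in at least one sampled individual."""
    return n_sites * (1.0 - (1.0 - p_carrier) ** n_individuals)


def discovery_rate(n_sites, p_carrier, n_individuals):
    """Expected number of new variant sites contributed by the next individual."""
    return n_sites * p_carrier * (1.0 - p_carrier) ** n_individuals


# Hypothetical values, chosen only to contrast a fast- and a slow-mutating class.
N_SITES = 1_000_000
P_FAST = 1e-4  # assumed per-individual carrier probability, high-mutation-rate class
P_SLOW = 1e-5  # assumed per-individual carrier probability, low-mutation-rate class

for n in (1_000, 20_000, 60_706):
    fast = expected_variant_sites(N_SITES, P_FAST, n) / N_SITES
    slow = expected_variant_sites(N_SITES, P_SLOW, n) / N_SITES
    print(f"n={n:>6}: fast class {fast:.3f} of sites seen, slow class {slow:.3f}")
```

Under these assumptions the discovery rate falls with every added individual, and it falls much sooner for the fast-mutating class, which is the qualitative pattern described for CpG dinucleotides.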
Third, ExAC promotes the discovery of
genes involved in rare diseases. In 2009, my
group and others showed how exome sequenc-
ing could be used to identify Mendelian-disease
genes or to diagnose Mendelian disease [1,7,8].
Because there are tens of thousands of genetic
variants in an exome, these strategies depended
on effectively filtering out common variants,
which are not likely to cause Mendelian dis-
orders. At that time, databases of common
variants were uneven and of suspect quality.
Although ESP greatly improved the situation by
uniformly and systematically cataloguing both
common and rare variants across the exome [4],
ExAC is an order of magnitude larger, and so
enables better filtering. This is especially rel-
evant for exome sequencing of non-European,
non-African-American individuals, because
ExAC provides greater sampling of individuals
from outside the United States than ESP does.
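The filtering step can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the cited studies: the `Variant` record, the `filter_common` function, the gene labels and the 0.1% frequency threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str            # hypothetical gene label
    change: str          # hypothetical description of the change
    reference_af: float  # allele frequency in the reference database (0.0 if unseen)


def filter_common(variants, max_af=0.001):
    """Keep only variants at or below max_af in the reference database."""
    return [v for v in variants if v.reference_af <= max_af]


# Made-up variants from one hypothetical patient exome.
patient_variants = [
    Variant("GENE_A", "missense", 0.25),
    Variant("GENE_B", "stop-gain", 0.0),
    Variant("GENE_C", "missense", 0.0004),
    Variant("GENE_D", "splice-site", 0.03),
]

candidates = filter_common(patient_variants)
print([v.gene for v in candidates])  # ['GENE_B', 'GENE_C']
```

The size of the reference panel sets how fine this filter can be: with n individuals, the smallest non-zero frequency that can be observed is 1/(2n), which for 60,706 individuals is roughly 8 in a million.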
On a related point, the study finds that hun-
dreds of variants previously claimed to cause
Mendelian disorders occur at implausibly high
frequencies. As such, the authors suggest that
they be reclassified as benign. A related study [9]
shows how ExAC may also force a reassessment
of whether some genes are involved at all in par-
ticular rare disorders. There is little doubt that
ExAC will both refine and accelerate Mende-
lian-gene discovery and clinical genetics.
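One way to see why a frequency can be "implausibly high" is a back-of-the-envelope check. The sketch below is an assumption-laden illustration rather than the method used in the study: it supposes a dominant, fully penetrant disorder and random mating, in which case the proportion of people carrying the variant cannot exceed the proportion of people with the disease. The function names and example numbers are hypothetical.

```python
def carrier_frequency(allele_frequency):
    """Proportion of people carrying at least one copy, assuming random mating."""
    return 1.0 - (1.0 - allele_frequency) ** 2


def is_implausibly_common(allele_frequency, disease_prevalence):
    """True if carriers would outnumber patients under the stated assumptions.

    Assumes a dominant, fully penetrant disorder; reduced penetrance or
    recessive inheritance would need a different bound.
    """
    return carrier_frequency(allele_frequency) > disease_prevalence


# Hypothetical disorder affecting 1 in 10,000 people.
PREVALENCE = 1e-4

print(is_implausibly_common(1e-3, PREVALENCE))  # True: about 1 in 500 people would carry it
print(is_implausibly_common(1e-6, PREVALENCE))  # False: about 1 in 500,000 would carry it
```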
Finally, the consortium’s approach to data
aggregation and sharing is admirable. ExAC
is both a technical and a political achievement,
requiring wrangling not only of data but also of
investigators, consents and more from 14 stud-
ies — most of which were directed at the genet-
ics of various common diseases.
An ongoing challenge in genomics is balanc-
ing the privacy rights of human participants
with a strong tradition of promptly and openly
sharing data. Building on the precedent of ESP,
ExAC strikes this balance by publicly releasing
aggregate analyses, namely a catalogue of variants
and the frequencies at which they arise, but
not data about associated traits or other
individual-level information (although raw data
for many studies in ExAC are theoretically
accessible through restricted databases). In this way,
the study maximizes benefit while minimizing
harm. These data have already been available
on a terrifically intuitive website for nearly two
years (http://exac.broadinstitute.org/), and the
site has accrued more than 4 million page views.
If there is one take-home message, it is
that there is incredible value in aggregating
sequencing data across genomic studies. As
the exomes aggregated by ExAC represent
just a small fraction of the human samples
that have been subjected to exome or genome
sequencing so far, we can and should do better.
In the coming decade, the number of human
genomes that will be sequenced in some man-
ner will grow to at least tens of millions and, by
the end of this century, perhaps even billions.
The beginnings of saturation seen here with
CpG dinucleotides may eventually be observed
deeply and at every site, providing a nucleotide-level
footprint of the human genome.
Jay Shendure is in the Department of Genome
Sciences, University of Washington, Seattle,
Washington 98195, USA, and is an investigator
of the Howard Hughes Medical Institute.
e-mail: shendure@uw.edu
1. Ng, S. B. et al. Nature 461, 272–276 (2009).
2. Shendure, J. & Ji, H. Nature Biotechnol. 26,
1135–1145 (2008).
3. Lek, M. et al. Nature 536, 285–291 (2016).
4. Fu, W. et al. Nature 493, 216–220 (2013).
5. The 1000 Genomes Project Consortium. Nature
526, 68–74 (2015).
6. The GTEx Consortium. Science 348, 648–660 (2015).
7. Ng, S. B. et al. Nature Genet. 42, 30–35 (2010).
8. Choi, M. et al. Proc. Natl Acad. Sci. USA 106,
19096–19101 (2009).
9. Walsh, R. et al. Genet. Med.
http://dx.doi.org/10.1038/GIM.2016.90 (2016).
Figure 1 | Exome aggregation. [Schematic: exome sequencing of protein-coding DNA from Individual 1 to
Individual 60,706, with the sites of variation between individuals marked.] The Exome Aggregation
Consortium (ExAC) [3] reanalysed the raw DNA-sequencing data from the protein-coding part of the genome,
known as the exome, of 60,706 individuals, aggregated from 14 distinct studies. Genetic variants (red) are
compared to produce a database of all sites of variation between the individuals.
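The comparison step described in the caption can be sketched as follows. This is a toy illustration only, assuming the sequences are already aligned and of equal length; real exome data are processed by read alignment and joint variant calling, which this sketch does not attempt. The four short sequences and the function name are illustrative assumptions.

```python
from collections import Counter

# Four short, already-aligned illustrative sequences (hypothetical individuals).
sequences = {
    "Individual 1": "AGACTATAGAGATC" + "GATAGATATAGCGATA",
    "Individual 2": "AGACAATAGAGATC" + "GATACATATAGCTATA",
    "Individual 3": "AGACTATAGAGATC" + "GATACATATAGCGATA",
    "Individual 4": "ACACTATAGAGATC" + "GATACATATAGCTATA",
}


def sites_of_variation(aligned):
    """Map each variable position (0-based) to the counts of bases observed there."""
    columns = zip(*aligned.values())  # one tuple of bases per aligned position
    return {
        position: Counter(bases)
        for position, bases in enumerate(columns)
        if len(set(bases)) > 1  # keep positions where individuals differ
    }


catalogue = sites_of_variation(sequences)
for position, counts in catalogue.items():
    print(position, dict(counts))
```

Dividing each count by the number of sequences gives the frequency of each base at that site, which is the kind of aggregate information a variant catalogue reports.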
NEUROSCIENCE
Flipping the
sleep switch
Inactivation of a group of sleep-promoting neurons through dopamine signalling
can cause acute or chronic wakefulness in flies, depending on changes in three
different potassium-channel proteins. See Letter p.333
STEPHANE DISSEL & PAUL J. SHAW
Many people have nodded off during
a long road trip, or lain in bed
desperately trying to fall asleep.
These experiences illustrate real-world con-
sequences of an improperly maintained bal-
ance between sleep- and wake-promoting
neural circuits. On page 333, Pimentel et al. [1]
describe the identification of a bona fide
molecular switch that allows wake-promoting
signals to turn off individual sleep-promoting
neurons to regulate waking. These find-
ings open up avenues for understanding the
complexity of sleep regulation in healthy
individuals and during disease.
Multiple sleep and wake circuits are found
throughout the mammalian central nervous
system and are believed to interact in a mutually
inhibitory manner [2,3]. A similar organization
is found in the fruitfly Drosophila, in which
independent sleep and wake centres cooperate
278 | NATURE | VOL 536 | 18 AUGUST 2016
NEWS & VIEWS RESEARCH © 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.