ChIP-on-chip protocol for genome-wide analysis
of transcription factor binding in Drosophila
melanogaster embryos
Thomas Sandmann¹, Janus S Jakobsen¹ & Eileen E M Furlong¹
EMBL Heidelberg, Meyerhofstrasse 1, 69117 Heidelberg, Germany. Correspondence should be addressed to E.E.M.F. (furlong@embl.de).
Published online 25 January 2007; doi:10.1038/nprot.2006.383
This protocol describes a method to detect in vivo associations between proteins and DNA in developing Drosophila embryos.
It combines formaldehyde crosslinking and immunoprecipitation of protein-bound sequences with genome-wide analysis using
microarrays. After crosslinking, nuclei are enriched using differential centrifugation and the chromatin is sheared by sonication.
Antibodies specifically recognizing wild-type protein or, alternatively, a genetically encoded epitope tag are used to enrich for
specifically bound DNA sequences. After purification and polymerase chain reaction-based amplification, the samples are
fluorescently labeled and hybridized to genomic tiling microarrays. This protocol has been successfully used to study different
tissue-specific transcription factors, and is generally applicable to in vivo analysis of any DNA-binding protein in Drosophila embryos.
The full protocol, including the collection of embryos and the collection of raw microarray data, can be completed within 10 days.
INTRODUCTION
A large number of DNA-binding and chromatin-associated pro-
teins are recruited to organize, package, transcribe or repair the
genetic information stored in the nucleus of eukaryotic cells. An
important step toward understanding any of these (and other)
processes at a molecular level is to determine the sites of direct
protein–DNA interactions.
ChIP-on-chip allows genome-wide protein–DNA interactions to
be assayed in vivo
Approaches such as DNase I footprinting¹ and electrophoretic
mobility shift assays² allow monitoring of binding events in vitro,
whereas chromatin immunoprecipitation studies are a valuable
tool to probe these essential interactions in vivo. By co-precipitating
the protein of interest with its associated DNA sequences from
cellular or embryonic chromatin extracts, a snapshot of binding site
occupancy—reflecting complex variables such as chromatin acces-
sibility, combinatorial binding and/or competition with other
DNA-associated factors—can be obtained (Fig. 1). Typically, che-
mical reagents or ultraviolet irradiation are used to stabilize
transient interactions by introducing covalent crosslinks. Formal-
dehyde, a small molecule that readily penetrates biological samples,
induces (partially) reversible crosslinks between lysine ε-amino groups of
proteins and (i) neighboring peptide bonds (protein–protein cross-
linking) or (ii) amino groups of partially denatured DNA bases
(protein–DNA crosslinks)³,⁴. Ultraviolet irradiation, on the other
hand, induces irreversible bonds exclusively between nucleic acids
and directly bound proteins, but cannot penetrate more than a few
cell layers, requiring dissociation of cells from larger tissues and
embryos⁵⁻⁷.
After crosslinking, cells are lysed and the chromatin is sheared to
fragments typically between 0.2 and 1 kb in length. Sequences
specifically bound by the protein of interest are then purified by
immunoprecipitation. Traditionally, Southern blot⁸ or quantitative
real-time polymerase chain reaction (qPCR)⁹ have been used to
[Figure 1 panel labels: Stage-matched wt embryos; Formaldehyde fixation, preparation of chromatin; Immunoprecipitation; Crosslink reversal, amplification, Klenow labeling; Labeled genomic DNA; Hybridization to tiling arrays; Statistical analysis; Visualization, data integration]
Figure 1 | Schematic overview of ChIP-on-chip experiments. A population
of wild-type (or transgenic) D. melanogaster embryos is dechorionated,
and covalent bonds between proteins, as well as between proteins and nucleic acids,
are introduced by formaldehyde crosslinking. Shearing the DNA allows
immunoprecipitation of short sequences associated with the protein of
interest. After partial reversal of the crosslinks, these can be amplified,
labeled and hybridized against a genomic reference to genomic tiling arrays.
Raw data processing followed by statistical analysis allows identification of
significantly enriched sequences bound by the protein of interest.
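As a rough illustration of the last step in Figure 1, the sketch below computes per-probe log2 ratios of the immunoprecipitated channel over the genomic reference channel, median-centers them, and flags probes above a cutoff. It is a simplified stand-in for a proper statistical analysis and not the procedure used in this protocol; the function names, the intensity values and the `min_log2` cutoff are all hypothetical.

```python
import math
import statistics


def log2_ratios(ip, ref):
    """Per-probe log2(IP / reference) for paired two-channel intensities."""
    if len(ip) != len(ref):
        raise ValueError("channels must contain the same number of probes")
    return [math.log2(i / r) for i, r in zip(ip, ref)]


def median_center(values):
    """Shift values so that their median is zero (a simple normalization)."""
    med = statistics.median(values)
    return [v - med for v in values]


def enriched_probes(ip, ref, min_log2=1.0):
    """Indices of probes whose centered log2 ratio reaches the cutoff.

    min_log2 is an arbitrary illustrative threshold, not a recommended value.
    """
    centered = median_center(log2_ratios(ip, ref))
    return [idx for idx, value in enumerate(centered) if value >= min_log2]


# Hypothetical background-corrected intensities for six neighboring probes.
ip_signal = [200.0, 210.0, 1600.0, 1700.0, 190.0, 205.0]
ref_signal = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]

hits = enriched_probes(ip_signal, ref_signal)
print(hits)
```

In practice, replicate experiments and a statistical test that accounts for probe-level variance would replace the fixed cutoff used here.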
NATURE PROTOCOLS | VOL.1 NO.6 | 2006 | 2839
PROTOCOL