Interactions in Oligonucleotide Hybrid Duplexes on Microarrays
Hans Binder,*,† Toralf Kirsten,† Ivo L. Hofacker,‡ Peter F. Stadler,†,§ and Markus Loeffler†,|
Interdisciplinary Centre for Bioinformatics, University of Leipzig, Institute of Theoretical Chemistry and
Structural Biology, University of Vienna, Bioinformatics group, Department of Computer Science, and
Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Kreuzstrasse 7b,
D-4103 Leipzig, Germany
Received: January 29, 2004; In Final Form: August 23, 2004
We investigated Affymetrix GeneChip intensity data in terms of chip-averaged sensitivities over all perfect
match (PM) and mismatch (MM) probes possessing a common triple of neighboring bases in the middle of
their sequence. This approach provides a model-independent estimation of base-specific contributions to the
probe sensitivities. We found that fluorescent labels attached to nucleotide bases forming Watson-Crick
(WC) pairs in most cases decrease their binding affinity and, thus, decrease the sensitivity of the probe.
Single-base-related mean sensitivity values rank in the order C > G ≈ T > A. The
central base of PM and MM probes mainly forms WC pairings in duplexes with nonspecific transcripts,
which obviously dominate the chip-averaged sensitivity values. Linear combinations of the triple-averaged
probe sensitivities provide nearest-neighbor (NN) sensitivity terms, which rank in an order similar to that of the
respective NN free-energy terms obtained from previous thermodynamic studies on the stability of RNA/DNA
duplexes in solution. Systematic deviations between the two data sets can be mostly attributed to the
labeling of the target RNA in the chip experiments. Our results provide a set of molecular NN and
single-base-related interaction parameters which consider specific properties of duplex formation in microarray
hybridization experiments.
Introduction
Target binding to high-density oligonucleotide microarrays
used for gene expression experiments is governed by the
molecular interactions in the hybrid duplexes formed by RNA
fragments and DNA probes. Detailed knowledge of the
DNA/RNA hybridization behavior at the molecular level, and
its estimation by means of effective parameters, is one
prerequisite for selecting optimal probe sequences from target
genes for newly designed chips. Short oligonucleotides in particular
might be ineffective as RNA binders as a result of
relatively weak interactions between probe and target. Existing
methods for chip design mostly involve thermodynamic criteria
based on interaction parameters referring to hybrid duplexes in
solution for the optimization of probe sequences (see refs 1, 2
and references therein). Recent analyses show that several
factors, such as the presence of fluorescent labels, modify the
stability of RNA/DNA duplexes on microarrays compared with
duplexes in solution (refs 3, 4). The understanding of the hybridization
properties of microarray probes presumably requires a modified
view of the molecular interactions in DNA/RNA duplexes,
which takes into account labeling and also, possibly, effects
due to the fixation of the probes at the quartz surface.
Available microarray intensity data are directly related to the
binding affinity of the individual probes (ref 4). They therefore provide
valuable information about molecular interactions in RNA/DNA
duplexes, which can be used to extract relevant interaction
parameters. In this work, we make use of two types of
redundancies in the design of Affymetrix GeneChip microarrays,
which were created to improve the reliability of the method (refs 5, 6).
First, so-called probe sets consisting of 11-20 different
reporter probes for each gene allow us to estimate the sensitivity
of a probe as the deviation of its intensity from the respective
set average on a logarithmic scale (ref 4).
The sensitivity of a microarray
oligonucleotide probe characterizes its ability to detect a
certain amount of RNA transcripts independently of the conditions
of sample preparation, hybridization, and measurement
of the fluorescence intensity. It is mainly determined by the
affinity of a particular DNA probe to bind RNA fragments via
complementary Watson-Crick (WC) pairs.
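The definition of probe sensitivity given above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `probe_sensitivities`, the choice of base-10 logarithms, and the use of the mean of the log intensities as the "set average" are assumptions made for illustration, and the intensity values are invented.

```python
import math


def probe_sensitivities(intensities):
    """Sensitivity of each probe in one probe set.

    Assumption: sensitivity = log10(intensity) minus the mean of
    log10(intensity) over all probes of the same probe set.
    """
    if not intensities:
        raise ValueError("a probe set needs at least one probe")
    if any(i <= 0 for i in intensities):
        raise ValueError("intensities must be positive to take logarithms")
    logs = [math.log10(i) for i in intensities]
    set_mean = sum(logs) / len(logs)
    return [value - set_mean for value in logs]


# Hypothetical intensities for a three-probe set (real sets hold 11-20 probes).
example_set = [100.0, 1000.0, 10.0]
sensitivities = probe_sensitivities(example_set)
```

By construction the sensitivities of one probe set sum to zero, so a positive value marks a probe that responds more strongly than the set average and a negative value a probe that responds more weakly.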
Second, each probe is present in pairs of so-called perfect
match (PM) and mismatch (MM) modifications. The sequence
of the PM is taken from the gene of interest, and thus, it is
complementary to a 25-mer in the RNA target sequence. The
sequence of the MM is identical to that of the PM probe
except at the middle position of the oligomer, where the
base is replaced by its complementary base. The pairwise
design of probes is intended to measure the amount of nonspecific
hybridization and, in this way, to correct the PM intensities.
An important question for GeneChip data analysis is how to
include the MM intensities adequately. One prerequisite for
solving this issue is the detailed study of the effect of the MM
base in probe-target duplexes on the signal intensity.
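The PM/MM construction described above can be sketched in a few lines of Python. This is only an illustration: the function name `mismatch_probe` and the example sequence are hypothetical, and the sketch assumes probes of odd length (such as the 25-mers mentioned in the text) written in the DNA alphabet A, C, G, T.

```python
# Complementary DNA bases used to build the mismatch at the middle position.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm_sequence):
    """Return the MM sequence for a PM probe.

    The MM equals the PM except at the middle position, where the base
    is replaced by its complementary base.
    """
    if len(pm_sequence) % 2 == 0:
        raise ValueError("expected an odd-length probe with a unique middle base")
    middle = len(pm_sequence) // 2  # index 12, i.e. position 13, for a 25-mer
    swapped = COMPLEMENT[pm_sequence[middle]]
    return pm_sequence[:middle] + swapped + pm_sequence[middle + 1:]


# Hypothetical 25-mer PM probe with C as its middle base.
pm = "A" * 12 + "C" + "T" * 12
mm = mismatch_probe(pm)
```

Applying the function twice returns the original PM sequence, which reflects that the two members of a probe pair differ only in the middle base.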
In the accompanying paper (ref 4), we found that the middle base
systematically shifts the PM and MM probe sensitivities relative
to one another. Other studies also reported that the strength of
base-pair interaction in the middle of the oligonucleotide affects the
affinity of the probes for target binding to an extraordinary
extent (refs 3, 7). In addition, stacking interactions between nearest
* Corresponding author. E-mail: binder@izbi.uni-leipzig.de. Fax: ++49-341-1495-119.
† Interdisciplinary Centre for Bioinformatics.
‡ Institute of Theoretical Chemistry and Structural Biology.
§ Department of Computer Science.
| Institute for Medical Informatics, Statistics and Epidemiology.
18015 J. Phys. Chem. B 2004, 108, 18015-18025
10.1021/jp049592o CCC: $27.50 © 2004 American Chemical Society
Published on Web 10/27/2004