© 2000 Macmillan Magazines Ltd
“How often have I said that when you eliminate the impossible, whatever remains, however improbable, must be the truth?” exhorted the great sleuth¹. The principle of arriving at the truth by elimination is ancient; but on page 175 of this issue, Liu et al.² report a new technique for massively parallel elimination, which harnesses the power of DNA chemistry and biotechnology to solve a particularly difficult problem in mathematical logic.
The difficulty of finding solutions to mathematical problems is classified by the speed at which the best algorithm can compute their solutions. ‘Easy’ problems have algorithms with ‘running times’ that scale as a polynomial function of the number of variables (polynomial time or P problems). There is also a class of problems characterized by proofs that are easy to verify (non-deterministic polynomial time or NP problems), such as the famous travelling salesman problem. In the worst case, the best known algorithms for ‘hard’ NP problems have running times that grow exponentially with the number of variables. For example, no polynomial-time method is known for finding a factor of a given natural number N, but verifying that another number d is a factor of N is easy. Computer scientists have been intensively studying whether sequential algorithms can solve all NP problems in polynomial time, but the answer is still unknown.
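The gap between verifying and finding can be made concrete with a minimal Python sketch. The function names `is_factor` and `find_factor` are illustrative, and trial division is used only because it is the simplest search, not because it is the best known factoring method; its step count grows exponentially with the number of digits of N.

```python
from typing import Optional


def is_factor(d: int, n: int) -> bool:
    """Verify a claimed nontrivial factor: one remainder computation."""
    return 1 < d < n and n % d == 0


def find_factor(n: int) -> Optional[int]:
    """Search for the smallest nontrivial factor by trial division.

    Returns None when no factor exists (n is prime or smaller than 4).
    Up to sqrt(n) candidates are tried one after another.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return None


if __name__ == "__main__":
    n = 91
    d = find_factor(n)          # slow part: sequential search
    print(d, is_factor(d, n))   # fast part: a single check
```

Checking a proposed answer costs one division, whereas the search may have to work through a long list of candidates before it finds one.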
In 1994, Leonard Adleman³ shocked the computing world by presenting a DNA-based polynomial-time method for the Hamilton path problem (Fig. 1a), the problem of finding an airline flight path between several cities on a map such that each city is visited exactly once. This NP problem is known to be one of the hardest. To achieve the short computation time, Adleman traded space (the amount of DNA needed) for time (the number of biochemical steps used). His key insight was that cities on a map, and paths between pairs of cities, may be encoded in strands of DNA. Millions of DNA strands, diffusing in a liquid, can self-assemble into all possible flight-path configurations, from which a judicious series of molecular manoeuvres can fish out the correct solution. Adleman, combining elegance with brute force, could isolate the one true solution out of many possibilities.
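For comparison, a conventional computer has to try the candidate routes one at a time. The sketch below is a plain sequential search in Python, not Adleman's protocol. The four-city map and the name `hamilton_path` are invented for illustration; the graph of Fig. 1a is not reproduced here.

```python
from itertools import permutations


def hamilton_path(cities, flights, start, end):
    """Return a route from start to end that visits every city exactly once.

    `flights` is a set of one-way (origin, destination) pairs.
    Every ordering of the intermediate cities is tested in turn,
    so the work grows factorially with the number of cities.
    """
    middle = [c for c in cities if c not in (start, end)]
    for order in permutations(middle):
        route = [start, *order, end]
        if all((a, b) in flights for a, b in zip(route, route[1:])):
            return route
    return None


if __name__ == "__main__":
    # Hypothetical map, assumed for this example only.
    cities = [1, 2, 3, 4]
    flights = {(1, 2), (2, 3), (3, 4), (1, 3), (2, 4)}
    print(hamilton_path(cities, flights, start=1, end=4))
```

In the DNA version, the loop over orderings is replaced by self-assembly: all candidate routes form at once, and the `all(...)` test becomes a series of laboratory selection steps.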
Every NP problem can be seen as the search for a solution that simultaneously satisfies a number of logical clauses, each composed of three variables (which can be true or false) connected by ‘or’ statements: for example (x₁ OR x₂ OR x̄₃) AND (x̄₄ OR x₅ OR x̄₆), where a bar over a variable denotes its negation. This particular problem, known as 3-SAT, is among the hardest of all NP problems. Liu et al.² show how to solve a simple case of 3-SAT in a reasonable amount of time by using a brute-force search made possible by the parallel nature of their DNA computing techniques.
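A sequential brute-force search for the example formula above can be sketched in a few lines of Python. The encoding is an assumption made for this sketch: each clause is a tuple of signed integers, where `3` stands for x₃ and `-3` for its negation. The names `satisfies` and `solve` are illustrative.

```python
from itertools import product


def satisfies(assignment, formula):
    """True if the assignment makes at least one literal true in every clause."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in formula
    )


def solve(formula, n_vars):
    """Test all 2**n_vars truth assignments one after another."""
    return [
        bits
        for bits in product([False, True], repeat=n_vars)
        if satisfies(bits, formula)
    ]


if __name__ == "__main__":
    # (x1 OR x2 OR NOT x3) AND (NOT x4 OR x5 OR NOT x6)
    formula = [(1, 2, -3), (-4, 5, -6)]
    solutions = solve(formula, n_vars=6)
    print(len(solutions), "of", 2 ** 6, "assignments satisfy the formula")
```

Each extra variable doubles the number of assignments the loop must visit, which is the cost the DNA approach spreads across many molecules.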
They begin with a string of binary digits representing the values of the variables in a given 3-SAT formula. Such a binary string can be represented by a unique sequence of nucleotides in single-stranded DNA; for example, TGCGG might stand for 001. For n variables, there are 2ⁿ unique answer (or Watson) strands, so for three variables you need eight Watson strands. For each Watson strand, there is also a complementary Crick strand created by the base-pairing rule: A bonds to T, and C bonds to G. The goal is to identify those strings out of a library of eight that satisfy all the clauses of a particular 3-SAT formula (Fig. 1b).
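The encoding can be written down as a lookup table. The sketch below uses the eight sequences listed in Fig. 1b and derives each Crick strand with the base-pairing rule. It is a simplification: it pairs bases position by position and ignores strand orientation (real complementary strands run antiparallel). The names `WATSON` and `crick` are illustrative.

```python
# Binary answer -> Watson strand, as listed in Fig. 1b.
WATSON = {
    "000": "ATGCC",
    "001": "TGCGG",
    "010": "AAGCG",
    "011": "CCTAT",
    "100": "TAGAC",
    "101": "GGATT",
    "110": "CTTCG",
    "111": "GTAAT",
}

# Base-pairing rule: A bonds to T, and C bonds to G.
PAIRING = str.maketrans("ATCG", "TAGC")


def crick(watson: str) -> str:
    """Return the base-by-base complement of a Watson strand."""
    return watson.translate(PAIRING)


if __name__ == "__main__":
    for bits, strand in WATSON.items():
        print(bits, strand, crick(strand))
```

Applying `crick` twice returns the original strand, which reflects the symmetry of the pairing rule.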
Liu et al.² first immobilized the Watson DNA strings corresponding to all candidate solutions on a specially treated gold surface. Next they added all possible Crick strands that will stick to a Watson string satisfying the first clause. Such pairing creates double-stranded DNA. The remaining single-stranded molecules are those that do not satisfy the first clause, and these are destroyed by enzymes. The surface is then heated to melt away the complementary strands and washed, and a fresh collection of Crick strands is paired with strings satisfying the second clause. This cycle is repeated for each of the remaining clauses. At the end, only those strands whose sequence satisfies the original formula survive.
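The cycle described above can be mimicked with set operations. The Python sketch below is a model of the logic only, not of the chemistry. It uses the two clauses of Fig. 1b, written as signed integers (`-3` meaning the negation of x₃, an encoding assumed for this sketch). The step counter assumes three operations per clause (pairing, enzymatic destruction, melting and washing) plus one final readout, which is one way to arrive at the (3k+1) count quoted in the article.

```python
from itertools import product


def clause_ok(strand: str, clause) -> bool:
    """True if the binary string makes at least one literal of the clause true."""
    return any((strand[abs(lit) - 1] == "1") == (lit > 0) for lit in clause)


def surface_compute(formula, n_vars):
    """Return the surviving binary strings and the number of steps taken."""
    # All 2**n candidate answers start out attached to the surface.
    surface = {"".join(bits) for bits in product("01", repeat=n_vars)}
    steps = 0
    for clause in formula:
        # Pair: complementary strands stick to every string satisfying the clause.
        paired = {s for s in surface if clause_ok(s, clause)}
        steps += 1
        # Destroy: unpaired single strands are removed.
        surface &= paired
        steps += 1
        # Melt and wash: the complementary strands are stripped off again.
        steps += 1
    steps += 1  # final readout of the survivors
    return sorted(surface), steps


if __name__ == "__main__":
    # (x1 OR x2 OR NOT x3) AND (NOT x1 OR x2 OR NOT x3), as in Fig. 1b
    formula = [(1, 2, -3), (-1, 2, -3)]
    print(surface_compute(formula, n_vars=3))
```

In the laboratory, each set comprehension acts on every strand at the same moment, so the number of steps depends on the number of clauses and not on the number of candidate answers.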
In this system, the DNA ‘answers’ are attached randomly to the surface (rather than in an ordered array) so, to read out the answer, the surviving strands first have to be amplified using the polymerase chain reaction. Their identities are then determined by pairing with an ordered array of strings identical to the original set of sequences. Not counting the number of steps required to produce the DNA molecules in the first place, the algorithm takes only (3k+1) steps, where k is the number of clauses, for a brute-force evaluation of all 2ⁿ possible answers. This represents a remarkable improvement over the best conventional
NATURE | VOL 403 | 13 JANUARY 2000 | www.nature.com 143
news and views
DNA computing on a chip
Mitsunori Ogihara and Animesh Ray
In a DNA computer, the input and output are both strands of DNA. A
computer in which the strands are attached to the surface of a chip can
now solve difficult problems quite quickly.
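[Figure 1a: graph of seven nodes, numbered 1 to 7, joined by paths; image not reproduced]
[Figure 1b: DNA strings encoding all candidate answers, attached to the surface]
Binary string
x₁ x₂ x₃    DNA string    Strand number
0  0  0     ATGCC         1
0  0  1     TGCGG         2
0  1  0     AAGCG         3
0  1  1     CCTAT         4
1  0  0     TAGAC         5
1  0  1     GGATT         6
1  1  0     CTTCG         7
1  1  1     GTAAT         8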
Figure 1 The parallel power of DNA computing. a, An example of the Hamilton path problem solved by Adleman³. Can you go from node 1 to node 7 using only the paths shown such that you visit all the nodes exactly once? The answer is positive. b, Among the hardest of such computationally difficult or NP problems is 3-SAT. In order to find a solution to the 3-SAT problem defined by these two clauses (x₁ OR x₂ OR x̄₃) AND (x̄₁ OR x₂ OR x̄₃), Liu et al.² attach DNA strings encoding all possible answers to a specially treated surface. Complementary DNA strands that satisfy the first clause are added to the solution, and stick to strands numbered 1 and 3–8. The remaining single strand 2 is destroyed by enzymes. The complementary strands are removed and the surface is washed. The cycle is repeated for the second clause, which results in the destruction of strand 6. The identities of the remaining strands are read out to give the correct solutions to the problem: 000, 010, 011, 100, 110, 111.