JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 10, Number 2, 2003
© Mary Ann Liebert, Inc.
Pp. 215–229

DNA Library Design for Molecular Computation

ROBERT PENCHOVSKY and JÖRG ACKERMANN

Biomolecular Information Processing (BioMIP), Fraunhofer Gesellschaft, Schloss Birlinghoven, D-53754 Sankt Augustin, Germany.

ABSTRACT

A novel approach to designing a DNA library for molecular computation is presented. The method is employed for encoding binary information in DNA molecules. It aims to achieve practical discrimination between perfectly matched DNA oligomers and those with mismatches in a large pool of different molecules. The approach takes into account the ability of DNA strands to hybridize into complex structures such as hairpins, internal loops, or bulge loops, and computes the stability of the resulting hybrids from thermodynamic data. A dynamic programming algorithm is applied to calculate the partition function for the ensemble of structures that play a role in the hybridization reaction. The applicability of the method is demonstrated by the design of a twelve-bit DNA library. The library is constructed and experimentally tested using molecular biology tools. The results show a high level of specific hybridization achieved for all library words under identical conditions. The method is also applicable to the design of primers for PCR, DNA sequences for isothermal amplification reactions, and capture probes in DNA-chip arrays. The library could be applied to integrated DNA computing of twelve-bit instances of NP-complete combinatorial problems by multi-step DNA selection in microflow reactors.

Key words: DNA library, DNA computation, code design, free energy, partition function.

INTRODUCTION

Following the experiment of Adleman (1994), it has been hypothesized that, with a large quantity of DNA, biomolecular computers could exploit massive parallelism to solve NP-complete problems in polynomial time (Gifford, 1994; Lipton, 1995). Instances of NP-complete problems, such as the maximum clique problem (Quyang et al., 1997) and the SAT problem (Liu et al., 2000; Braich et al., 2001, 2002), have been solved using DNA/DNA hybridization. A key question in DNA computing is the fidelity of the basic operations employed and the scalability of the computation as a whole (James et al., 1998; Cox et al., 1999; Pevzner et al., 2001). When DNA/DNA hybridization is used as a basic computational operation, the accuracy of the computation depends on the ability to discriminate between perfectly matching hybrids (the bits of the library and their complementary oligomers) and those with mismatches. In this regard, the quality of the DNA code design plays a critical role in the fidelity of the computation. The problem of designing sets of modular RNA and DNA sequences that hybridize in a predefined way is fundamental not only for molecular computing but also