X. Yao et al. (Eds.): PPSN VIII, LNCS 3242, pp. 511–521, 2004. © Springer-Verlag Berlin Heidelberg 2004

Designing Multiple-Use Primer Set for Multiplex PCR by Using Compact GAs

Yu-Cheng Huang 1, Han-Yu Chuang 1, Huai-Kuang Tsai 1, Chun-Fan Chang 2,*, and Cheng-Yan Kao 1,*

1 Dept. of Computer Science and Information Engineering, National Taiwan University, Taiwan
{r91021,r90002,d7526010,cykao}@csie.ntu.edu.tw
2 Chinese Culture University, Taiwan
chunfan@ms17.hinet.net

Abstract. Reducing the number of primers needed in multiplex polymerase chain reaction experiments is useful, or even essential, in large-scale genomic research. In this paper, we transform this multiple-use primer design problem into a set-covering problem and propose a modified compact genetic algorithm (MCGA) approach to find optimal solutions. Our experimental results demonstrate that MCGA effectively reduces the number of primers in multiplex PCR experiments on whole-genome data sets of four test species within a feasible computation time, especially when applied to complex genomes. Moreover, MCGA exhibits better global stability of optimal solutions than conventional heuristic methods, which may fall into local optima.

1 Introduction

Molecular analyses and extended diagnostic applications are often restricted by the limited availability of biological materials. The Polymerase Chain Reaction (PCR) [1], which uses primers to amplify specific DNA segments, is thus essential to current genomic research, such as the preparation of DNA spotting material for constructing full-genome spotted microarrays [2]. Multiplex PCR [3], which uses multiple primers to concurrently amplify multiple target DNA segments in a single reaction [4], is considered a time- and reagent-saving technique for the simultaneous amplification of different targets.
In current multiplex PCR, primers are usually designed to be between 17 and 25 nucleotides (nt) long, and the number of primers is exactly twice the number of targets (one forward and one reverse primer per target). The 17~25 nt range is used because the number of possible 17-nt sequences (4^17, approximately 1.7×10^10) already exceeds the size of the human genome (3×10^9 bp), so a primer of this length is unlikely to prime at random sites in the human genome. Many primer selection programs have been commercialized [5, 6], such as Primer 3 [7], and focused on designing unique left and right primers for each gene

* Corresponding authors.