High Speed Homology Search with FPGAs ∗

Yoshiki YAMAGUCHI, Tsutomu MARUYAMA
Institute of Engineering Mechanics and Systems, University of Tsukuba, 1-1-1 Ten-ou-dai Tsukuba Ibaraki, 305-8573, JAPAN

Akihiko KONAGAYA
Japan Advanced Institute of Science and Technology, 1-1 Asahidai Tatsunokuchi Ishikawa, 923-1292, JAPAN
Riken Genomic Sciences Center, 1-7-22 Suehiro Tsurumi Yokohama Kanagawa, 230-0045, JAPAN

We introduce a way to achieve high-speed homology search by adding only one off-the-shelf PCI board with one Field Programmable Gate Array (FPGA) to a Pentium-based computer system already in use. An FPGA is a reconfigurable device, and any kind of circuit, such as one implementing a pattern matching program, can be realized on it in a moment. Performance is almost proportional to the size of the FPGA used in the system, and FPGAs are becoming larger and larger following Moore's law. The latest, larger FPGAs are readily available on off-the-shelf PCI boards at low cost. Our results are as follows. With a board carrying one of the latest FPGAs, performance is comparable to that of small to middle class dedicated hardware systems, and it can be accelerated further by using more FPGA boards. Comparing a query sequence of 2,048 elements with a database sequence of 64 million elements by the Smith-Waterman algorithm takes about 34 sec, which is about 330 times faster than a desktop computer with a 1GHz Pentium III. We can also accelerate a laptop computer using a PC card with one smaller FPGA. Comparing a query sequence of 1,024 elements with the database sequence of 64 million elements takes about 185 sec, which is about 30 times faster than the desktop computer.

1 Introduction

In the past several years, genetic and genomic databases have grown rapidly, and pattern matching problems in bioinformatics require enormous computation time.
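To make the scale of this computation concrete, the following is a minimal Python sketch of Smith-Waterman local-alignment scoring, the algorithm named in the abstract. The scoring values (match, mismatch, and a linear gap penalty) and the function names are illustrative assumptions, not the parameters or the circuit design used in this work.

```python
def smith_waterman_score(query, database, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between two sequences.

    Scoring values and the linear gap penalty are assumptions chosen
    for illustration only.
    """
    # Only the previous row of the dynamic-programming matrix is kept.
    prev = [0] * (len(database) + 1)
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, d in enumerate(database, start=1):
            diag = prev[j - 1] + (match if q == d else mismatch)
            up = prev[j] + gap          # gap in the database sequence
            left = curr[j - 1] + gap    # gap in the query sequence
            cell = max(0, diag, up, left)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def cell_count(query_len, database_len):
    """Number of matrix cells evaluated for one full comparison."""
    return query_len * database_len
```

Assuming "64 million" means 64 × 10^6, `cell_count(2048, 64_000_000)` gives about 1.3 × 10^11 cells for the larger comparison quoted in the abstract. Every cell depends only on its upper, left, and upper-left neighbors, so cells on the same anti-diagonal can in principle be evaluated in parallel, which is the kind of regularity that hardware can exploit.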
Many algorithms [4,5,6] and dedicated hardware systems [11,12,13] have been developed. The results reflect a trade-off among quality, time, and cost. With desktop computer systems, it is unrealistic to check all pattern matching possibilities within a reasonable time. Therefore, simplified (but still very effective) algorithms have been designed and used on these systems.

∗ This work was supported by Grant-in-Aid for Scientific Research on Priority Areas (C) "Genome Information Science" from the Ministry of Education, Culture, Sports, Science and Technology of Japan, and Japan Society for the Promotion of Science (JSPS) Research Fellowships for Young Scientists (#5304).

Pacific Symposium on Biocomputing 7:271-282 (2002)

With dedicated hardware systems, the computation time can be