FPGA-based Heterogeneous Architecture for Sequence Alignment
Xin Chang, Fernando A. Escobar, Carlos Valderrama
Service d’électronique et de Microélectronique, Faculté Polytechnique de Mons, Université de Mons, 7000, Belgium
ABSTRACT
With the rapid development of genome sequencing technology, the cost of obtaining genome data is becoming increasingly insignificant. However, the computational speed of genome data analysis has remained the same. The bioinformatics community is facing a serious challenge in dealing with massive amounts of data. In this paper, we propose a novel heterogeneous architecture for sequence alignment. As will be demonstrated, the speed of sequence alignment can be improved with reasonable resource utilization on programmable logic.
Keywords
Sequence Alignment, Smith-Waterman, Heterogeneous Architecture, FPGA, Systolic Array
1. INTRODUCTION
Nowadays, bioinformatics plays an essential role in processing genomic, medical and proteomic data generated by high-throughput technologies. Genome sequencing, in particular, is an emerging and widely used technique. With the development of novel technologies such as PCR [1] sequencing, the cost and time of genome sequencing have dropped dramatically over the last decade.
Sequence alignment analyzes similarities between DNA or protein sequences to assess the genetic relationship between organisms or species. It helps scientists to check pathogenic mutations, understand the evolution of creatures and predict the structure of organisms. It is widely used in bioinformatics, drug and medicine design, and other related areas. However, the rate of genome data generation exceeds the speed at which it can be computationally processed. In addition, the databases of genome sequences are growing and becoming large-scale. These are the reasons why the acceleration of genome sequence alignment has become an emerging bioinformatics activity.
This paper proposes a novel FPGA-based heterogeneous architecture for sequence alignment. As will be demonstrated, the proposed architecture outperforms state-of-the-art approaches in terms of speed thanks to an optimized hardware and software partition. In the next section, current solutions for sequence alignment and the background of the Smith-Waterman algorithm are introduced. Section 3 analyses our optimization target, the Smith-Waterman algorithm, highlighting its bottlenecks and the optimization strategies. Section 4 defines the proposed implementation architecture. Preliminary evaluation results are shown in Section 5. The last section concludes and presents future work.
2. BACKGROUND
A broad set of genome sequence alignment algorithms is available. They can be classified, according to the number of sequences processed simultaneously, into pairwise and multiple sequence alignment. ProbCons [2] and T-Coffee [3] are among the most commonly used algorithms for multiple sequence alignment. Nevertheless, most multiple sequence alignment approaches are based on an extended version of pairwise sequence alignment algorithms. Among pairwise sequence alignment algorithms, Smith-Waterman, BLAST [4] and HMMER [5] are the most representative. Due to its accuracy, the Smith-Waterman algorithm is one of the most widely used sequence alignment algorithms. Indeed, it is even used to refine the results of other, less accurate algorithms. However, because it is also computationally intensive, it requires more time than the others. These are the reasons why we are particularly interested in the Smith-Waterman algorithm.
Several implementation architectures, from classic multi-cores to Application-Specific Integrated Circuits (ASICs) [6], have been proposed to effectively accelerate the Smith-Waterman algorithm.
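To make the computational cost mentioned above concrete, the following is a minimal software sketch of the Smith-Waterman scoring phase, not the architecture proposed in this paper. It assumes a linear gap penalty, and the function name `smith_waterman_score` and the scoring values (match +2, mismatch -1, gap -1) are illustrative assumptions chosen only for the example.

```python
def smith_waterman_score(query, subject, match=2, mismatch=-1, gap=-1):
    """Fill the Smith-Waterman score matrix H and return (best score, H).

    Scoring values are illustrative assumptions; a linear gap penalty is used.
    """
    rows, cols = len(query) + 1, len(subject) + 1
    # Row 0 and column 0 stay at zero: a local alignment may start anywhere.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            same = query[i - 1] == subject[j - 1]
            diag = H[i - 1][j - 1] + (match if same else mismatch)
            up = H[i - 1][j] + gap      # gap inserted in the subject
            left = H[i][j - 1] + gap    # gap inserted in the query
            # Scores are clamped at zero, which makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best, H


best, H = smith_waterman_score("GATTACA", "ATTA")
print(best)
```

Every cell of the matrix is visited once, so the work grows with the product of the two sequence lengths. Each cell depends only on its upper, left and upper-left neighbours, so cells on the same anti-diagonal are independent of one another, which is the property that parallel implementations of the matrix fill typically exploit.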
An ASIC can deliver low cost and low power consumption, but with limited scalability and non-negligible design and fabrication times. As an alternative, recent approaches focus on Graphics Processing Units (GPUs) [7] and Field Programmable Gate Arrays (FPGAs) [8][9]. GPU-based solutions provide massive multi-threading to parallelize the Smith-Waterman algorithm. However, conflicting memory accesses, which must be serialized, become the bottleneck of the overall system. In 2012, K. Bankrid et al. explored the pros and cons of using FPGAs in bioinformatics [10]. Their results revealed that, compared to other platforms, FPGAs are generally a cost-effective and energy-efficient solution, as they rank first in both performance per dollar and performance per watt.
Several studies have already used FPGA-based platforms to accelerate the Smith-Waterman algorithm, such as [8], [9], and [11]. In [8] and [11], the whole Smith-Waterman algorithm (including trace-back) was implemented on an FPGA platform. However, parts of the algorithm consumed a large amount of FPGA resources without a clear performance improvement. Moreover, in these designs, due to the sequential nature of certain tasks, such as the trace-back, the FPGA is not able to fully exploit its inherent parallelism. In [9], the authors propose a heterogeneous FPGA-based