P.M.A. Sloot et al. (Eds.): ICCS 2003, LNCS 2658, pp. 981–990, 2003. © Springer-Verlag Berlin Heidelberg 2003

Parallel DNA Sequence Alignment Using a DSM System in a Cluster of Workstations

Renata Cristina Faray Melo, Maria Emília Telles Walter, Alba Cristina Magalhaes Alves de Melo, and Rodolfo B. Batista

Department of Computer Science, Campus Universitario - Asa Norte, Caixa Postal 4466, University of Brasilia, Brasilia – DF, CEP 70910-900, Brazil
{renata, mia, albamm, rodolfo}@cic.unb.br

Abstract. Distributed Shared Memory systems allow the use of the shared memory programming paradigm in distributed architectures where no physically shared memory exists. Scope-consistent software DSMs provide a relaxed memory model that reduces the coherence overhead by ensuring consistency only at synchronisation operations, on a per-lock basis. Much of the work in DSM systems is validated by benchmarks, and there are only a few examples of real parallel applications running on DSM systems. Sequence comparison is a basic operation in DNA sequencing projects, and most sequence comparison methods in use are based on heuristics, which are faster but do not produce optimal alignments. Recently, many organisms have had their DNA entirely sequenced, which creates the need to compare long DNA sequences, a challenging task due to its high demands for computational power and memory. In this article, we present and evaluate a parallelisation strategy for implementing a sequence alignment algorithm for long sequences. This strategy was implemented in JIAJIA, a scope-consistent software DSM system. Our results on an eight-machine cluster showed good speedups, indicating that our parallelisation strategy and programming support were appropriate.

1 Introduction

In order to make shared memory programming possible in distributed architectures, a shared memory abstraction must be created. This abstraction is called Distributed Shared Memory (DSM).
The first DSM systems tried to give parallel programmers the same guarantees they had when programming uniprocessors. It has been observed that providing such a strong memory consistency model creates a huge coherence overhead, slowing down the parallel application and frequently bringing the system into a thrashing state [13]. To alleviate this problem, researchers have proposed relaxing some consistency conditions, thus creating new shared memory behaviours that differ from the traditional uniprocessor one. In the shared memory programming paradigm, synchronisation operations must be used whenever processes want to restrict the order in which memory operations are performed. Using this fact, hybrid Memory Consistency Models guarantee that processors only have a consistent view of the shared memory at synchronisation