Journal of Computer Science 4 (5): 393-401, 2008 ISSN 1549-3636 © 2008 Science Publications

A Fast Pattern Matching Algorithm with Two Sliding Windows (TSW)

Amjad Hudaib, Rola Al-Khalid, Dima Suleiman, Mariam Itriq and Aseel Al-Anani
Department of Computer Information Systems, University of Jordan, Amman 11942 Jordan

Corresponding Author: Amjad Hudaib, Department of Computer Information Systems, University of Jordan, Amman 11942, Jordan Tel.: +962-5355000/ext: 22610 Fax: +962-5354070

Abstract: In this research, we propose a fast pattern matching algorithm: The Two Sliding Windows (TSW) algorithm. The algorithm makes use of two sliding windows, each of which has a size equal to the pattern length. Both windows slide in parallel over the text until the first occurrence of the pattern is found or until both windows reach the middle of the text. The experimental results show that the TSW algorithm is superior to other algorithms, especially when the pattern occurs at the end of the text.

Key words: Pattern matching, string matching, Berry-Ravindran algorithm, Boyer-Moore

INTRODUCTION

Pattern matching is a pivotal theme in computer research because of its relevance to various applications such as web search engines, computational biology, virus scan software, network security and text processing [1-4]. Pattern matching focuses on finding the occurrences of a particular pattern P of length m in a text T of length n. Both the pattern and the text are built over a finite alphabet set called Σ of size σ.

Generally, pattern matching algorithms make use of a single window whose size is equal to the pattern length [5]. The searching process starts by aligning the pattern to the left end of the text, and then the corresponding characters from the pattern and the text are compared. Character comparisons continue until a whole match is found or a mismatch occurs; in either case the window is shifted to the right by a certain distance [6-12].
The shift value, the direction of the sliding window and the order in which comparisons are made vary among pattern matching algorithms. Some pattern matching algorithms concentrate on the pattern itself [5]. Other algorithms compare the corresponding characters of the pattern and the text from left to right [6]. Others perform character comparisons from right to left [8,11]. The performance of the algorithms can be enhanced when comparisons are done in a specific order [9,13]. In some algorithms, such as the Brute Force and Horspool algorithms, the order of comparisons is irrelevant [7].

In this study, we propose a new pattern matching algorithm: The Two Sliding Windows algorithm (TSW). The algorithm concentrates on both the pattern and the text. It makes use of two windows, each equal in size to the pattern. The first window is aligned with the left end of the text, while the second window is aligned with the right end of the text. Both windows slide at the same time (in parallel) over the text in the searching phase to locate the pattern. The windows slide towards each other until the first occurrence of the pattern from either side of the text is found or until they reach the middle of the text. If required, all the occurrences of the pattern in the text can be found.

Related works: Several pattern matching algorithms have been developed with a view to enhancing the searching process by minimizing the number of comparisons performed [14-16]. To reduce the number of comparisons, the matching process is usually divided into two phases: the pre-processing phase and the searching phase. The pre-processing phase determines the distance (shift value) that the pattern window will move. The searching phase uses this shift value while searching for the pattern in the text with as few character comparisons as possible.

In the Brute Force algorithm (BF), no pre-processing phase is performed. It compares the pattern with the text from left to right.
After each attempt, it shifts the pattern by exactly one position to the right. The time complexity of the searching phase is O(mn) in the worst case and the expected number of text character comparisons is 2n.

Many algorithms, such as the Boyer-Moore (BM) [11,17] and Knuth-Morris-Pratt (KMP) [6,18] algorithms, reduce the number of comparisons performed by moving the pattern more than one position at a time.
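
The Brute Force search described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the paper: the function name `brute_force_search` and the comparison counter are assumptions introduced here to make the O(mn) worst case visible.

```python
from typing import List, Tuple


def brute_force_search(text: str, pattern: str) -> Tuple[List[int], int]:
    """Return (start positions of all matches, character comparisons made).

    Hypothetical helper for illustration: no pre-processing, left-to-right
    comparison, and a shift of exactly one position after every attempt.
    """
    n, m = len(text), len(pattern)
    positions: List[int] = []
    comparisons = 0
    if m == 0 or m > n:
        return positions, comparisons

    # Try every alignment of the window, from the left end of the text.
    for pos in range(n - m + 1):
        j = 0
        # Compare pattern and window characters from left to right.
        while j < m:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break  # mismatch: give up on this alignment
            j += 1
        if j == m:
            positions.append(pos)  # whole match found at this alignment
        # In either case the window moves right by exactly one position.
    return positions, comparisons


if __name__ == "__main__":
    print(brute_force_search("aaaaaaab", "aab"))  # ([5], 18)
    print(brute_force_search("abcabc", "abc"))    # ([0, 3], 8)
```

In the first call (n = 8, m = 3) every one of the n - m + 1 = 6 alignments needs all m = 3 comparisons, giving 18 = m(n - m + 1) comparisons, which is the O(mn) worst-case behaviour. Algorithms that shift by more than one position aim to skip some of these alignments.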