Enhancing the performance of decision tree-based packet classification algorithms using CPU cluster

Mahdi Abbasi¹ · Aazad Shokrollahi¹

Received: 27 June 2019 / Revised: 8 January 2020 / Accepted: 29 February 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
Packet classification is an essential process in network processors. In this process, incoming packets are matched against a set of filters and divided into specified streams. Classification methods are either software-based or hardware-based. Unlike hardware-based methods, software-based methods are more flexible and cost less. This paper reports on an experiment in which the Hierarchical-trie (H-trie) algorithm, a software-based method, was parallelized on a CPU cluster for the first time. The algorithm is characterized by building a decision tree with minimal memory usage and search complexity. We implemented and ran different scenarios using MPI, OpenMP, and a combination of the two, on a system with a single multi-core processor as well as on clusters of multi-core processors. Our results suggest that classification speed increases linearly with the number of processor cores. Moreover, MPI uses more memory than OpenMP but provides a higher classification rate. The results of the combined method show that the maximum packet classification speed is achieved when the number of processes and threads equals the number of processor cores. Also, the lowest classification time and memory usage are achieved when the total number of processes and threads does not exceed the number of CPU cores.

Keywords: OpenMP · MPI · Packet classification · H-trie algorithm · CPU cluster

1 Introduction

Today, there is an urgent need for more efficient networks due to the growing number of users, increased communication traffic, and the emergence of new areas of use, such as multimedia data.
Consequently, network designers search for solutions to increase the speed and efficiency of network processes. One solution is to increase the speed of communication lines. To achieve an acceptable level of efficiency in network processes, routers and switches must also be able to keep up with the speed of network communication [1–4]. In addition to routing and processing, quality of service entails certain operations on the packets, so routers need new mechanisms for reserving resources, creating queues, and fair scheduling to offer better services to users. A prerequisite for this is the efficient classification of packets into different streams for processing.

Classification methods are hardware-based or software-based. The major problem of hardware-based methods is that they are limited by the memories they rely on for parallel searching. Furthermore, they are costly and consume considerable power [5–7]. Therefore, they are only suitable for classification with a small number of filters. Software-based methods have recently come to the fore [6, 8, 9]. These methods are very flexible and cost less. Much research has been conducted into software-based methods. Most studies have attempted to reduce the time and number of memory accesses by using specific techniques for designing algorithms and data structures [10–13]. Taylor et al. [14] offer a general categorization of classification methods, including linear search, decomposition, tuple space, and decision tree. The following is a brief description of each.

Mahdi Abbasi, abbasi@basu.ac.ir
¹ Department of Computer Engineering, Engineering Faculty, Bu-Ali Sina University, Hamedan, Iran
Cluster Computing, https://doi.org/10.1007/s10586-020-03081-7

Linear search: Filters are arranged according to their priority. The incoming packet is compared with the filters