Enhancing the performance of decision tree-based packet classification algorithms using CPU cluster

Mahdi Abbasi 1 • Aazad Shokrollahi 1

Received: 27 June 2019 / Revised: 8 January 2020 / Accepted: 29 February 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
Packet classification is an essential process in network processors. In this process, incoming packets are matched against a set of filters and divided into specified streams. Classification methods are either software-based or hardware-based. Unlike hardware-based methods, software-based methods are more flexible and have a lower cost. This paper reports on an experiment in which the Hierarchical-trie (H-trie) algorithm, a software-based method, was parallelized on a CPU cluster for the first time. The distinguishing feature of this algorithm is that it builds a decision tree with minimal memory usage and search complexity. We implemented and executed different scenarios using MPI, OpenMP, and a combination of the two, on a system with a single multi-core processor as well as on clusters of multi-core processors. Our results suggest that increasing the number of processor cores increases the classification speed linearly. Moreover, MPI uses more memory than OpenMP but provides a higher classification rate. The results of the combined method show that the maximum packet classification speed is achieved when the number of processes and threads is equal to the number of processor cores. In addition, the lowest classification time and memory usage are achieved when the sum of processes and threads does not exceed the number of CPU cores.

Keywords OpenMP · MPI · Packet classification · H-trie algorithm · CPU cluster

1 Introduction

Today, there is an urgent need for more efficient networks due to the growing number of users, increased communication traffic, and the emergence of new areas of use, such as multimedia data.
Consequently, network designers search for solutions to increase the speed and efficiency of network processes. One solution is to increase the speed of communication lines. To achieve an acceptable level of efficiency in network processes, routers and switches must also be able to keep up with the speed of network communication [1–4]. In addition to routing and processing, quality of service requires certain operations on packets, so routers need new mechanisms for reserving resources, creating queues, and fair scheduling in order to offer better services to users. A prerequisite for this is the efficient classification of packets into different streams for processing.

Classification methods are either hardware-based or software-based. The main problem with hardware-based methods is that they are constrained by the memories that support parallel searching. Furthermore, they have high cost and power consumption [5–7]. Therefore, they are only suitable for classification with a small number of filters. Software-based methods have recently come to the fore [6, 8, 9]. These methods are very flexible and carry a lower cost.

Much research has been conducted into software-based methods. Most studies have attempted to reduce the classification time and the number of memory accesses by using specific techniques for designing algorithms and data structures [10–13]. Taylor et al. [14] offer a general categorization of classification methods into linear search, decomposition, tuple space, and decision tree. Each category is briefly described below.

Mahdi Abbasi, abbasi@basu.ac.ir
1 Department of Computer Engineering, Engineering Faculty, Bu-Ali Sina University, Hamedan, Iran
Cluster Computing, https://doi.org/10.1007/s10586-020-03081-7

Linear search: Filters are arranged according to their priority. The incoming packet is compared with the filters