International Journal of Electrical and Computer Engineering (IJECE)
Vol. 8, No. 4, August 2018, pp. 2521~2530
ISSN: 2088-8708, DOI: 10.11591/ijece.v8i4.pp2521-2530
Journal homepage: http://iaescore.com/journals/index.php/IJECE
Impact of Packet Inter-arrival Time Features for Online
Peer-to-Peer (P2P) Classification
Bushra Mohammed Ali Abdalla¹, Mosab Hamdan², Mohammed Sultan Mohammed³,
Joseph Stephen Bassi⁴, Ismahani Ismail⁵, Muhammad Nadzir Marsono⁶
¹,²,³,⁵,⁶ Department of Electronic and Computer Engineering, Faculty of Electronic Engineering,
Universiti Teknologi Malaysia, 81310, Johor Bahru, Malaysia
⁴ Department of Computer Engineering, Faculty of Engineering, University of Maiduguri, Borno State, Nigeria
Article Info
Article history:
Received Apr 12, 2018
Revised Jul 20, 2018
Accepted Jul 26, 2018
ABSTRACT
Identifying bandwidth-heavy Internet traffic is important for network
administrators who need to throttle high-bandwidth application traffic.
Flow-feature-based classification has previously been proposed as a promising
method to identify Internet traffic based on packet statistical features. The
selection of statistical features plays an important role in accurate and timely
classification. In this work, we investigate the impact of packet inter-arrival
time features on online P2P classification in terms of accuracy, Kappa
statistic and time. Simulations were conducted using available traces from the
University of Brescia, the University of Aalborg and the University of Cambridge.
Experimental results show that the inclusion of inter-arrival time (IAT) as an
online feature increases simulation time and decreases classification accuracy
and Kappa statistic.
Keywords:
Feature selection
Machine learning
Online features
P2P
Copyright © 2018 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Muhammad Nadzir Marsono,
Department of Electronic and Computer Engineering,
Faculty of Electronic Engineering,
Universiti Teknologi Malaysia,
81310, Johor Bahru, Malaysia.
Email: nadzir@fke.utm.my
1. INTRODUCTION
Today, peer-to-peer (P2P) serves as an architecture for sharing a wide range of media on the Internet.
P2P traffic represents about 27% to 60% of the total Internet traffic, depending on geographic location [1],
[2]. The high volume of P2P traffic is due to file sharing, video streaming, online gaming and other activities
that the client-server architecture cannot accomplish as quickly or as efficiently as the P2P architecture. The
rapid growth of P2P traffic volume over the years has resulted in deteriorated network performance
and congestion due to the high bandwidth consumption of P2P applications [3]. Therefore, traffic
identification is required to improve traffic management.
Traffic from first-generation P2P applications was relatively easy to identify because it used fixed
port numbers. However, current P2P applications are able to circumvent port-based identification by using
anonymous port numbers or port disguise [2], [4]. Methods that rely on inspecting application
payload signatures have also been proposed [5]. For reasons of privacy and practicality, these methods are
ineffective. The ineffectiveness of the port-based and payload-based methods prompted the use of flow
statistics as features for traffic identification. Flow-statistics-based strategies offer more flexibility in
detecting P2P traffic than signature-based and port-based methods.
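To make the notion of flow statistics concrete, the following minimal Python sketch derives a few inter-arrival time (IAT) statistics from the packet timestamps of a single flow. It is only an illustration: the function name iat_features, the choice of statistics (minimum, maximum, mean and standard deviation) and the sample timestamps are assumptions made here, not the feature set or data used in this work.

```python
from statistics import mean, pstdev


def iat_features(timestamps):
    """Return simple IAT statistics for one flow.

    `timestamps` holds packet arrival times in seconds, in arrival order.
    The statistics chosen here are illustrative assumptions only.
    """
    if len(timestamps) < 2:
        # A flow with fewer than two packets has no inter-arrival time.
        return {"min": 0.0, "max": 0.0, "mean": 0.0, "std": 0.0}

    # Inter-arrival time: gap between each packet and the one before it.
    iats = [later - earlier
            for earlier, later in zip(timestamps, timestamps[1:])]

    return {
        "min": min(iats),
        "max": max(iats),
        "mean": mean(iats),
        "std": pstdev(iats),  # population standard deviation
    }


# Hypothetical arrival times (seconds) of four packets in one flow.
flow = [0.0, 0.5, 1.5, 3.0]
features = iat_features(flow)
print(features)
```

Each IAT value depends on the arrival time of the preceding packet, so in an online setting such features can only be updated as packets of the flow are observed.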
Several techniques proposed over the last two decades have focused on the identification accuracy
attainable with various machine learning (ML) algorithms. However, the effect of distinct sets of
statistical features has not been researched in depth. Work in [6] has reported that