Proceedings of the 9th INDIACom; INDIACom-2015; 2015 2nd International Conference on “Computing for Sustainable Global Development”, 11th-13th March, 2015
Bharati Vidyapeeth’s Institute of Computer Applications and Management (BVICAM), New Delhi (INDIA)

Real Time Monitoring and Analysis of Available Bandwidth in Cellular Network-using Big Data Analytics

Arjun Sahni, DAV Institute of Engg. & Technology, Jalandhar, Punjab, India. Email Id: arjun.19.sahni@gmail.com
Divyanshu Marwah, DAV Institute of Engg. & Technology, Jalandhar, Punjab, India. Email Id: divyanshu.marwah@hotmail.com
Dr. Raman Chadha, Professor, Head (C.S.E), CGCT, Jhanjeri, Mohali. Email Id: dr.ramanchadha@gmail.com

Abstract - Data in the digital universe is increasing at an exponential rate. Growing access to the Internet and greater network bandwidth generate large amounts of data, and analyzing the current network status requires estimates of bandwidth and link capacity. The proper operation and maintenance of a network requires a reliable and efficient monitoring mechanism that can handle large amounts of monitoring data. The design is not limited to a particular monitoring protocol, since it employs a generic structure for data handling. Hence, it is applicable to a wide variety of monitoring solutions.

Keywords - Network monitoring, end-to-end monitoring, Big Data, bandwidth analysis, real-time bandwidth monitoring.

I. INTRODUCTION

The amount of data on the Internet is increasing at an exponential rate, from 160 exabytes in 2006 [1] to over 2.7 zettabytes in 2012 [2]. A huge amount of data is uploaded to and downloaded from the Internet. The Internet has become tremendously important: it is used all over the world for innumerable applications and is seen by many as a tool for research. As Internet usage increases, providers are offering more extensive services to individual users as well as to corporations.
There is one thing that all end-users demand in common: quality and speed of transmission. Network topologies are not uniform, i.e. different users have different access to the Internet, at different speeds. For example, a number of users may share a common Internet connection. When traffic increases because of high usage, network congestion can occur. This congestion has a negative impact on data transmission rates and on the overall quality at individual destinations. It is therefore important to actively measure and analyze the bandwidth to avoid these negative effects.

First, the bandwidth is estimated using the BART estimation technique [3]. The real-time data is recorded in a NoSQL data store. It is then analyzed and visualized in R, and conclusions are drawn from it.

II. DATA COLLECTION

A. Basic Network Management Theory

Bandwidth estimation is a non-trivial network performance measurement task, particularly in large, high-speed networks, where accurate results are difficult to obtain. This is partly due to the number of existing bandwidth-related metrics: capacity, available bandwidth, bulk transfer capacity (BTC), achievable TCP throughput, et cetera. ISPs, on the other hand, use metrics such as latency (propagation delay, one-way delay, round-trip time (RTT), queuing delay), packet loss, TCP throughput, link utilization and availability to measure bandwidth. State-of-the-art bandwidth estimation tools such as Pathload [4], Pathchirp [5], TOPP [6] or BART [3] measure and analyze some of these metrics to make accurate bandwidth measurements while relying only on traditional best-effort services.

B. Measurement Methods

Network management involves two measurement techniques: the active measurement method and the passive measurement method.

C. Passive Method

Systems that follow the passive measurement technique measure performance by monitoring network traffic without modifying it [7]. Unlike the active method, the passive method injects no probe packets into the network. Instead, these systems depend only on observation of user-generated traffic, performed at network devices (e.g. servers, switches and routers). Collecting Simple Network Management Protocol (SNMP) [8] data is one of the passive techniques. SNMP data provides switch-level and router-level information