Studies in Informatics and Control 26 (3) 365-376, September 2017
https://doi.org/10.24846/v26i3y201712
ISSN: 1220-1766 eISSN: 1841-429X
ICI Bucharest © Copyright 2012-2017. All rights reserved

A Big Data Framework for Mining Sensor Data Using Hadoop

Engy A. EL-SHAFEIY*, Ali I. EL-DESOUKY
Computers and Systems Department, Faculty of Engineering, Mansoura University, Egypt
(*Corresponding author) e-mail: engy.elshafeiy@gmail.com

Abstract: The data gathered from the Internet of Things (IoT) is considered to be of high business value. IoT devices sense natural conditions using sensor networks composed of sensor nodes. Mining big sensor data to extract useful knowledge is a very challenging task. Frequent itemset mining is one of the most effective mining techniques for finding important itemsets in big sensor data. In this paper, a MapReduce Frequent Nodesets-based Boundary POC tree (MR-FNBP) framework is proposed for mining Frequent Nodesets from big sensor data. The MapReduce framework is used to implement MR-FNBP in order to enhance its performance in highly distributed environments. Additionally, the proposed FNBP builds a boundary at an early stage to exclude infrequent itemsets, which may reduce the overall memory and time usage. A number of experiments were performed to evaluate the performance of the MR-FNBP framework. The results show that the MR-FNBP framework is highly scalable and less time-consuming than several recent systems.

Keywords: Big data, Internet of Things, MapReduce, Wireless Sensor Networks, Mining Frequent Nodesets.

1. Introduction

Nowadays, the Internet of Things (IoT) is growing quickly as a subset of big data. Billions of new physical devices, such as smart devices [4] and Wireless Sensor Networks (WSNs) [15], are expected to be connected in the near future. WSNs are used in various applications and services by both public and private organizations, especially in the medical field and health care. Therefore, the data gathered from WSNs is considered to be a great source of big data. With the recent advancements in communication technology, more and more data is generated and collected. As a result, big data will grow exponentially, which increases the difficulty of extracting and retrieving the valuable knowledge hidden in it.

There are more than three billion users of smart objects, including smart phones, smart homes, as well as business and entertainment applications [16]. These smart devices allow Machine-to-Machine (M2M) electronic communication with or without an intermediary user. This has led to what is known as the "Internet of Things (IoT)" [8]. The huge amount of data generated has been useful in various fields, such as commercial, industrial, scientific, social and medical [11], as shown in Figure 1.

Big data is a collection of very large datasets with a great diversity of types, so that it becomes difficult to process by using state-of-the-art data processing approaches or traditional data processing platforms such as Processing Big Trajectory Data [19]. In 2012, Gartner gave a more detailed definition: Big data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization. Big data was first characterized by the 3Vs (Volume, Velocity, and Variety), which were later extended with Value, Veracity, and Viability to form the characteristics known as the 6Vs:

Volume: Describes the huge data size.
Velocity: Describes the speed at which data is communicated and processed per unit of time.
Variety: Describes the different data types (structured, semi-structured, and unstructured).
Value: Describes the valuable knowledge contained in the data.
Veracity: Describes the data quality, which is addressed by steps such as data cleaning and filtering.
Viability: Describes the prediction possibilities.

More generally, a dataset can be called big data if it is formidably difficult to capture, analyze and visualize using current technology. With diversified data sources, such as sensor networks, telescopes, scientific experiments, and high-throughput instruments, datasets are growing at an exponential rate [18]. Other big data applications lie in many scientific disciplines, such as astronomy, atmospheric science, medicine, genomics, biology, biogeochemistry and other complex and interdisciplinary scientific research. Web-based