This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE SYSTEMS JOURNAL

A Strategy for Elimination of Data Redundancy in Internet of Things (IoT) Based Wireless Sensor Network (WSN)

Shishupal Kumar and Vijay Kumar Chaurasiya

Abstract—In order to give a complete description of an environment or to make a robust decision, a number of observations must be collected and combined from multiple sensor nodes. In these large collections of data, only some observations are useful, whereas others are redundant. This redundancy decreases performance in terms of computing overhead, excessive transmission, and covering a large space. The process of selecting and analyzing the useful information from the collection of sensed data is called mining. Mining is used to produce more consistent, accurate, and useful information than that provided by any individual sensor node. Data mining has been widely applied in many areas, such as object recognition, wireless sensor networks (WSNs), image processing, environment mapping, and localization. Nowadays, the Internet of Things utilizes the WSN as a necessary platform for sensing and communicating data. For efficiency, mining of spatial and temporal data is performed on the sensed samples collected by sensor nodes. Therefore, in this paper, a redundancy removal strategy is proposed, which performs mining on collected data to select the appropriate information before forwarding it to a base station or a cluster head in the WSN. Extensive simulations were conducted, and the related results showed that the proposed scheme had better performance compared to other schemes in our simulated scenarios.

Index Terms—Data mining, Internet of Things (IoT), performance analysis, wireless sensor network (WSN).

Manuscript received December 29, 2017; revised April 12, 2018, July 10, 2018, and September 11, 2018; accepted September 23, 2018. (Corresponding author: Shishupal Kumar.) The authors are with the Department of Information Technology, Indian Institute of Information Technology Allahabad, Allahabad 211015, India (e-mail: rsi2016506@iiita.ac.in; vijayk@iiita.ac.in). Digital Object Identifier 10.1109/JSYST.2018.2873591

I. INTRODUCTION

A wireless sensor network (WSN) contains a large number of nodes with sensing capability that detect changes in the surrounding real-world environment. The nodes in a WSN carry sensed information from one location to another for further processing [1]. As technology in this area advances, the WSN plays a major role by providing communication to smart devices. These smart devices communicate across different locations and provide a level of transparency among users within an interconnected smart network. Such a network, which consists of a number of sensor nodes used to send sensed information, is termed the Internet of Things (IoT) [2]. In this paper, the network is referred to as an IoT-oriented WSN. In this network, the users communicate with each other by exchanging sensed data, monitoring events and surroundings, and reacting autonomously.

Nowadays, the world is seeing a revolution in the services and management industries. This revolution is essential for automation through data mining and learning. IoT-oriented WSN services are provided through a standard interface that enables users to create a query, retrieve information, and change their states accordingly. An Internet link provides the standard interface between users and IoT devices [3].

However, an IoT-oriented WSN is an energy-constrained network; hence, various aspects have to be considered when transmitting data from each node to the destination (sink node) [4].
These aspects include battery power consumption, bandwidth, processing capability, and storage capacity. The lifetime of an IoT-oriented WSN is reduced when data packets are transmitted separately from each sensor node toward the cluster head or base station [5], which can waste both battery power and bandwidth. To overcome this issue, mining techniques have been proposed. Mining is the process of selecting important and useful data from the sensed information and observations of multiple sensor nodes. It combines the effective information into a single copy at intermediate sensor nodes, and that copy is able to meet the user's needs [6].

Data mining can be performed on the data collected by sensor nodes in two ways [7]: spatial and temporal. In the spatial case, typical WSN applications require data from spatially dense sensor deployments in order to achieve satisfactory coverage. As a result, multiple sensors record information about a single event in the sensor field. Due to the high density of the network topology, observations from spatially proximal sensors are highly correlated, and the correlation grows as the inter-node separation decreases. In the temporal case, some WSN applications, such as event tracking, may require sensor nodes to periodically observe and transmit the sensed event features with related information [8]. In this paper, we assume that data aggregation is performed by a cluster head sensor node. Clustering is an effective approach to reducing energy depletion. Cluster-based protocols partition a network into non-overlapping clusters, each containing a cluster head that acts as a gateway between the cluster members and the base station (sink).

1937-9234 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

For performance analysis, the proposed novel data mining (NDM) strategy is compared with the weighted