Round Robin Cycle for Predictions in Wireless Sensor Networks

Le Borgne Yann-Ael, Bontempi Gianluca
Machine Learning Group, Universite Libre de Bruxelles
CP 212, Bd Triomphe 1050 Bruxelles - Belgium
{yleborgn,gbonte}@ulb.ac.be

Abstract

Using prediction models in sensor networks is an efficient way to save energy: sensors whose readings are predicted can remain in idle mode, where they consume orders of magnitude less energy than in active mode. In continuous monitoring, where a set of sensors is typically required to send readings regularly to a central server, an interesting approach consists in splitting the set of sensors into two subsets, such that the readings of one subset are used to predict the readings of the other. In this paper, we propose to identify several sensor subsets for prediction, which are used in turn in a round robin fashion. Identifying different sensor subsets makes it possible to detect erroneous models or sensor failures, and to distribute energy consumption more evenly. The efficiency of the proposed procedure is demonstrated in a set of experiments using real-world sensor data.

INTRODUCTION

Many applications of wireless sensor networks require a large set of sensors (potentially hundreds or thousands) to regularly report their readings to a central server for off-line data analysis. These applications, for instance environmental or structural monitoring, HVAC systems, or object tracking, also require the sensors to run for as long as possible. The limited energy resources available on a sensor module (a.k.a. mote) drive the need for efficient scheduling of sensing and networking activities, so as to maximize the application lifetime. On a mote, efficient energy consumption is achieved by switching components that are not in use to a sleep mode.
Different operating modes allow the mote to switch components such as the micro-controller unit (MCU), sensors, flash memory or radio on and off. Radio transmission and reception are known to be the dominant factors of energy consumption on a mote [1], as their energy cost is at least an order of magnitude higher than that of the MCU alone.

Many strategies for in-network compression have been suggested in the literature to decrease the use of radio communication, such as distributed source coding, routing compression [7], or cluster-based aggregation [4]. These compression methods save energy by reducing the communication time between motes. However, they still need the sensors to collect measurements at regular intervals, so even though overall radio activity is reduced, motes still consume energy by keeping their MCU and sensors ON while data is sensed from the environment.

To further decrease energy consumption, it has recently been proposed to use a model-driven approach to reduce the number of sensors solicited in a sensing task [2]. In this approach, a subset of sensors is identified (the prediction subset), from which the readings of the remaining sensors (the predicted subset) can be predicted within a user-specified error threshold and confidence level. This offers optimal energy savings for the sensors not solicited since, with current technology, energy consumption in the OFF operating mode is three orders of magnitude lower than when the MCU is ON, and four orders of magnitude lower than when both the MCU and the radio are ON [1].

Two undesirable consequences, however, follow from the use of a single prediction subset:

1) Erroneous models: If dependencies between sensor readings change over time, prediction models will become outdated, entailing a possibly rapid deterioration in predicted sensor values.
2) Unequal energy distribution: Continuous solicitation of the same sensor subset will lead to the energy depletion of these sensors.

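
The model-driven approach and the first consequence listed above can be illustrated with a minimal Python sketch. It is not the method of [2] or of this paper: it assumes a single prediction sensor, a single predicted sensor, and a simple least-squares linear model, and the function names and temperature readings are hypothetical. It fits a model on past readings, then checks whether the fraction of predictions falling within the error threshold reaches the required confidence level.

```python
# Illustrative sketch only: the model class, function names and readings
# below are assumptions, not taken from the paper.

def fit_linear(xs, ys):
    """Least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fraction_within(model, xs, ys, threshold):
    """Fraction of readings predicted within +/- threshold of the true value."""
    slope, intercept = model
    hits = sum(abs(slope * x + intercept - y) <= threshold
               for x, y in zip(xs, ys))
    return hits / len(xs)


def model_is_acceptable(model, xs, ys, threshold, confidence):
    """True if enough predictions fall within the user-specified threshold."""
    return fraction_within(model, xs, ys, threshold) >= confidence


# Hypothetical temperature readings (degrees Celsius) from two nearby motes:
# sensor A is in the prediction subset, sensor B in the predicted subset.
train_a = [20.0, 21.0, 22.0, 23.0, 24.0]
train_b = [20.6, 21.4, 22.5, 23.6, 24.4]
model = fit_linear(train_a, train_b)

# Later readings, after the dependency between A and B has changed
# (B now reads roughly 2.5 degrees above A).
later_a = [20.0, 21.0, 22.0, 23.0, 24.0]
later_b = [22.5, 23.6, 24.4, 25.7, 26.5]

ok_at_training = model_is_acceptable(model, train_a, train_b, 0.5, 0.95)
ok_after_drift = model_is_acceptable(model, later_a, later_b, 0.5, 0.95)
print(ok_at_training, ok_after_drift)
```

With these made-up readings, the model is accepted on the data it was fitted on and rejected once the dependency has changed. The check is only possible because sensor B is still read in the second period; a sensor that is never queried again gives no opportunity to notice that its model has become outdated.
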
The first issue arises because the distribution of the readings generated by a sensor network cannot be assumed to be stationary. The data set used to select a prediction subset may not be consistent with subsequent measurements of the sensor network. Strategies to detect sensor malfunction and concept drift must therefore be implemented to avoid erroneous predictions. The second issue stems from the repeated use of the same subset of sensors.

In this article, we propose relying on a round robin system to address the above-mentioned issues. The principle consists in designing a cycle such that a different prediction subset is used at each step, and such that every sensor is queried at least once during a cycle. By continuing to collect measurements from all sensors, this scheme makes it possible to detect both faulty sensors and changes in the dependencies between sensors. Moreover, by taking into account estimates of the sensors' remaining energy, the cycle can be defined so as to minimize use of sensors