Please cite this article in press as: Lee, C.-H., et al., Effective processing of continuous group-by aggregate queries in sensor networks. J. Syst.
Software (2010), doi:10.1016/j.jss.2010.08.049
The Journal of Systems and Software xxx (2010) xxx–xxx
The Journal of Systems and Software
journal homepage: www.elsevier.com/locate/jss
Effective processing of continuous group-by aggregate queries in sensor networks
Chun-Hee Lee a, Chin-Wan Chung a,∗, Seok-Ju Chun b
a Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Republic of Korea
b Department of Computer Education, Seoul National University of Education, Seoul 137-742, Republic of Korea
article info
Article history:
Received 16 April 2009
Received in revised form 18 June 2010
Accepted 18 August 2010
Available online xxx
Keywords:
Sensor network
Group-by aggregate query
Haar wavelet
Two-phase collection
abstract
Aggregate queries are among the most important queries in sensor networks. In particular, group-by
aggregate queries can be used in various sensor network applications such as tracking, monitoring, and
event detection. However, most research has focused on aggregate queries without a group-by clause.
In this paper, we propose a framework, called the G-Framework, to effectively process continuous
group-by aggregate queries in an environment where sensors are grouped by geographical location.
In the G-Framework, we perform energy-efficient data aggregation and dissemination using
two-dimensional Haar wavelets. In addition, to process continuous group-by aggregate queries with a
HAVING clause, we divide data collection into two phases: in the first collection phase, only
non-filtered data are sent, and in the second collection phase, the data requested by the leader node
are sent. Experimental results show that the G-Framework processes continuous group-by aggregate
queries effectively in terms of energy consumption.
© 2010 Elsevier Inc. All rights reserved.
1. Introduction
Sensor networks consist of small sensors that have computing and communication
facilities. With the advancement of sensor technology, sensors are becoming
smaller and more powerful. Moreover, as the price of sensors falls, we expect
that large numbers of sensors will be used in various sensor network
applications.
∗ Corresponding author. Tel.: +82 42 350 3537; fax: +82 42 350 7737.
E-mail addresses: leechun@islab.kaist.ac.kr (C.-H. Lee), chungcw@kaist.edu
(C.-W. Chung), chunsj@snue.ac.kr (S.-J. Chun).
For example, a volcanologist can use a sensor network to monitor a dangerous
active volcanic area. Low-priced sensors can be scattered over the area from an
airplane. These sensors form a sensor network and monitor the volcano without
human intervention. However, sensors have very limited resources (e.g., memory,
computation, communication, and energy). Among these, energy is one of the most
important since battery replacement is difficult or impossible in such
environments. In sensor networks, since individual sensor readings are raw data,
many applications use aggregate values. In many cases, the aggregate values of
many regional areas are preferred to the aggregate value of the whole area since
the latter does not provide detailed information. That is, group-by aggregate
queries are useful in sensor networks. Therefore, in this paper, we consider
continuous group-by aggregate queries. Due to many shortcomings of current
technology, it is difficult to manage a large number of sensors. One effective
method for dealing with many sensors is clustering (Heinzelman et al., 2002;
Younis and Fahmy, 2004). Since sensor readings have spatial correlations,
spatial clustering of sensors has many benefits. Therefore, we deal with
group-by aggregate queries in an environment where sensors are grouped
(clustered) by geographical location. A group-by aggregate query may have a
HAVING clause, which is a predicate on the aggregate value of a group. The
queries we consider in this paper are shown in Fig. 1. However, we focus on the
query in Fig. 2(a) since the processing of the queries in Fig. 1 can be extended
from the processing of the query in Fig. 2(a). Also, the G-Framework can process
local predicates in a straightforward manner. Each node checks whether its
sensor readings satisfy the local predicates and builds a bitmap. Then, the node
sends only the satisfying data together with the bitmap. Therefore, for
convenience of explanation, we do not discuss local predicates further in this
paper.
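The local-predicate step described above can be illustrated with a minimal sketch. It assumes one bit per sensor reading, which the text does not specify, and the names apply_local_predicate and restore_positions, the sample readings, and the threshold are hypothetical choices made only for illustration.

```python
from typing import Callable, List, Optional, Tuple


def apply_local_predicate(
    readings: List[float], predicate: Callable[[float], bool]
) -> Tuple[List[int], List[float]]:
    """Return (bitmap, satisfied): one bit per reading, plus the readings that pass.

    Assumption: the bitmap holds one bit per reading, in reading order.
    """
    bitmap = [1 if predicate(r) else 0 for r in readings]
    satisfied = [r for r, bit in zip(readings, bitmap) if bit]
    return bitmap, satisfied


def restore_positions(
    bitmap: List[int], satisfied: List[float]
) -> List[Optional[float]]:
    """Receiver side: place each satisfying value back at its original position."""
    values = iter(satisfied)
    return [next(values) if bit else None for bit in bitmap]


# Hypothetical readings and a hypothetical local predicate (reading > 25.0).
readings = [21.5, 30.2, 18.0, 35.7]
bitmap, satisfied = apply_local_predicate(readings, lambda r: r > 25.0)
restored = restore_positions(bitmap, satisfied)
```

In this sketch only satisfied and bitmap would be transmitted; the receiver can tell from the bitmap which positions were filtered out.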
Many papers have proposed methods for processing aggregate queries
(Madden et al., 2002; Fan et al., 2002; Considine et al., 2004; Nath et al.,
2004; Shrivastava et al., 2004; Deligiannakis et al., 2004; Sharaf et al., 2003,
2004). However, most of them do not consider group-by aggregate queries.
Although some papers deal with processing group-by aggregate queries, they do
not focus on grouping by geographical location. In this paper, we focus on
processing such queries, which can be used in many sensor network applications
such as tracking, monitoring, and event detection. To process them, we make the
following assumptions:
• Sensors are grouped according to geographical location (see Fig. 2(b)).
A group consists of a leader node and member nodes.
The leader node and member nodes are connected in one hop (the