Bugs or Anomalies? Sequence Mining based
Debugging in Wireless Sensor Networks
Kefa Lu, Qing Cao, Michael Thomason
Department of Electrical Engineering and Computer Science
University of Tennessee, Knoxville, Tennessee 37919
Email: {klu3, cao, thomason}@utk.edu
Abstract—WSN applications are prone to bugs and failures
due to their typical characteristics, such as being extensively
distributed, heavily concurrent, and resource restricted. In this
paper, we propose and develop a flexible and iterative WSN
debugging system based on sequence mining techniques. First, we
develop a data structure called the vectorized Probabilistic Suffix
Tree (vPST), an elastic model to extract and store sequential
information from program runtime traces in compact suffix tree
based vectors. Then, we build a novel WSN debugging system
by integrating vPST with Support Vector Machines (SVM), a
robust and generic classifier for both linear and nonlinear data
classification tasks. Finally, we demonstrate that the vPST-SVM
debugging system is efficient, flexible, and generic through
three test cases, two on the LiteOS operating system and
one on the TinyOS operating system.
I. INTRODUCTION
In the past decade, Wireless Sensor Networks (WSNs) have
been widely developed and deployed for various purposes,
such as environmental monitoring and data collection [1],
[2], [3]. However, WSN applications are still suffering from
numerous types of bugs and frequent failures [3], [4], due to
their typical characteristics, such as distributed architecture,
concurrent execution model, and strict resource limitations. It
is difficult to perform efficient debugging on WSN applica-
tions, because many of them are context sensitive and event
driven. It is usually infeasible to fully control their operating
context and triggering events. In addition, many WSN bugs
are transient and irreproducible [5]. Designing and developing
robust WSN debugging systems therefore remains a major
challenge for WSN researchers and developers.
In this paper, we design, implement, and evaluate a flexible
and generic debugging system based on sequential data analy-
sis and outlier detection techniques. Our approach is based on
two theoretical models, the vectorized Probabilistic Suffix Tree
(vPST) model and the Support Vector Machine (SVM) model.
The original PST model is a flexible probabilistic model that
can efficiently extract and store sequential information from
sequences in compact suffix tree data structures [6], while
SVM is a robust and generic classification technique that
can solve both linear and nonlinear classification problems
[7]. By extending PST to vPST, we are able to retain not only
the sequential information but also the most significant
substructures within sequences in compact and simple vectors.
SVM can then be applied to these vectors to detect outliers
in the sequences. By combining the vPST model and the SVM
classifier with an efficient tracing subsystem, we find
that the resulting technique is highly effective at
locating real bugs.
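The idea of turning traces into vectors and then looking for outliers can be illustrated with a minimal sketch. It is not the vPST-SVM implementation described in this paper. It assumes a simplified stand-in for vPST, in which each trace is summarized by estimated next-symbol probabilities for contexts of bounded length, and it replaces the SVM with a plain distance-to-centroid score so that the example needs only the Python standard library. The event names, the traces, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def next_symbol_probs(trace, max_depth):
    """Estimate P(symbol | preceding context) for contexts of length 0..max_depth."""
    ctx_counts, pair_counts = Counter(), Counter()
    for i, sym in enumerate(trace):
        for d in range(min(max_depth, i) + 1):
            ctx = tuple(trace[i - d:i])  # the d events just before position i
            ctx_counts[ctx] += 1
            pair_counts[(ctx, sym)] += 1
    return {pair: n / ctx_counts[pair[0]] for pair, n in pair_counts.items()}


def vectorize(trace, features, max_depth):
    """Map a trace to a fixed-length vector; unseen (context, symbol) pairs get 0."""
    probs = next_symbol_probs(trace, max_depth)
    return [probs.get(f, 0.0) for f in features]


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


MAX_DEPTH = 2  # assumed context bound; it plays the role of a tree depth

# Hypothetical event traces: three normal runs and one suspicious run.
normal_traces = [
    ["send", "ack"] * 3,
    ["send", "ack"] * 4,
    ["send", "ack"] * 2 + ["send"],
]
suspect_trace = ["send", "send", "send", "ack", "send", "send"]

# The feature space is every (context, symbol) pair seen in the normal runs.
features = sorted(
    {pair for t in normal_traces for pair in next_symbol_probs(t, MAX_DEPTH)}
)
normal_vectors = [vectorize(t, features, MAX_DEPTH) for t in normal_traces]

# Stand-in for the SVM: flag any trace farther from the centroid of the
# normal vectors than the farthest normal trace.
centroid = [sum(col) / len(col) for col in zip(*normal_vectors)]
threshold = max(distance(v, centroid) for v in normal_vectors)
suspect_score = distance(vectorize(suspect_trace, features, MAX_DEPTH), centroid)

print("suspect flagged:", suspect_score > threshold)
```

Changing MAX_DEPTH changes how much sequential context each vector captures. In a real setting the centroid rule would be replaced by a trained classifier such as an SVM.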
Our contributions in this paper are two-fold. First, we
extend the PST model to the vPST model, which provides
researchers with a new methodology for extracting and analyzing
sequential information. Specifically, the vPST model breaks
sequences into pieces and stores them in meaningful data
structures. Second, we propose the vPST-SVM system, which
is especially helpful for detecting transient bugs. The whole
debugging process is iterative, meaning that it allows the
user to adjust debugging settings dynamically to achieve the
best results. The whole system is evaluated by comparing
prediction results for various test cases on different operating
systems, where we incrementally changed the vPST depths
during iterative debugging cycles.
The rest of this paper is organized as follows. In
Section II, we briefly discuss related work, including some
proposed WSN debugging systems. In Section III, we describe
the details of our vPST model and our iterative vPST-SVM
anomaly detection approach. In Section IV, we describe our
system design and implementation. In Section V, we present
three test cases for system evaluation. Section VI
concludes this paper with some discussion.
II. RELATED WORK
In the past decade, many different WSN debugging systems
have been proposed [8], [9], [10]. However, some of them
were not easily portable due to a strong dependency on specific
operating systems [8], [10]. Others were restricted to
source code analysis or simulation trace analysis [8], [9].
There is still a significant lack of efficient debugging systems
that can fully take advantage of runtime traces from real
deployments. In fact, many subtle bugs are caused by
race conditions or improperly controlled concurrency
and can only be triggered in real deployments under
specific circumstances.
Some efforts have been made to develop trace-based
WSN debugging systems using data mining techniques.
Dustminer [11] is based on frequent pattern mining, and it has
three significant drawbacks. First of all, it is
limited to frequent pattern mining, so Dustminer will
fail to detect bugs that only generate infrequent patterns.
Secondly, it requires a lot of human effort to figure out clear
978-1-4673-2433-5/12/$31.00 ©2012 IEEE 463