Dynamic Shape Analysis using Spectral Graph Properties
Muhammad Zubair Malik and Sarfraz Khurshid
Department of Electrical and Computer Engineering
The University of Texas at Austin
Austin, TX 78712
Email: {zubair@mail, khurshid@ece}.utexas.edu
Abstract—Dynamically allocated data structures pervade
imperative and object-oriented programs. Automated analysis
and testing of such programs requires reasoning about their
data structures. The structures often have complex structural
properties, such as acyclicity of the object graph rooted at a
given pointer. Such properties pose a challenge for automated
reasoning. Shape analysis is a class of techniques that address
reasoning about such programs. Traditionally, shape analysis
is performed using static analysis of the program code. More
recently, dynamic techniques for shape analysis have been
developed, which inspect program states to identify properties
of data structures. This paper presents a novel dynamic
technique, which adapts well-studied results from graph theory
to determine the shape of the program’s key data structures.
Specifically, spectral graph theory, a field that studies the
properties of a graph in relation to the properties of matrices
based on the graph, e.g., eigenvalues of its adjacency matrix,
provides the foundational ideas. Experimental results using a
suite of data structures demonstrate the potential the technique
holds in identifying data structure properties and detecting
likely erroneous program states.
Keywords-Structural invariant generation; Shape analysis;
Graph spectra; Deryaft
I. INTRODUCTION
Automated analysis and testing of programs written in
commonly used imperative and object-oriented languages
remains a challenging problem. Part of the challenge is in
automated reasoning about dynamic data structures that re-
side on the program heap and often have complex structural
properties, such as acyclicity of the object graph rooted at a
given pointer, which are hard to reason about.
Shape analysis is a class of techniques that address such
properties. Traditionally, shape analysis techniques use static
analysis of the program code to determine properties of
its data structures [13], [20], [25], [28]. A key motivation
behind the use of static analysis is to determine the prop-
erties at desired control points for all program executions,
say for program verification. More recently, dynamic tech-
niques [10], [15], which inspect program states at desired
control points to characterize data structure properties, have
been developed. While these techniques do not enable
verification for all executions, they enable detecting likely
erroneous executions at runtime and promise to be more
scalable for finding bugs than techniques based on static
analysis.
This paper presents a novel dynamic technique, which
adapts well-studied results from graph theory to determine
the shape of the program’s key data structures. We view the
object graph that represents a program heap as a mathemat-
ical object – an edge-labeled graph, where graph vertices
correspond to objects allocated on the heap and graph edges
correspond to fields of these objects [8], [9], [16]. Our tech-
nique is inspired by spectral graph theory [4] – a field that
studies the properties of a graph in relation to the properties
of matrices based on it, such as its adjacency matrix or
its Laplacian matrix. Specifically, we define properties of
recursive data structures using properties of eigenvalues of
the associated matrices as well as other graph properties,
such as in-degree of a vertex.
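As a minimal sketch of this view, and not the implementation described in this paper, the following Java code builds an adjacency matrix for the objects reachable from a root pointer, reads the in-degree of a vertex off a matrix column, and checks acyclicity. It relies on a standard fact: a directed graph on n vertices is acyclic exactly when its adjacency matrix A satisfies A^n = 0, which is equivalent to every eigenvalue of A being zero. The names HeapGraphSketch, Node, adjacency, inDegree, and isAcyclic are hypothetical, and edge labels (field names) are ignored for simplicity.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: all names here are illustrative and not taken from the paper.
class HeapGraphSketch {

    // A heap object with two reference fields; each non-null field is one graph edge.
    static final class Node {
        Node left;
        Node right;
    }

    // Numbers the objects reachable from root in breadth-first order and returns
    // the adjacency matrix: a[i][j] counts the fields of object i that point to object j.
    static int[][] adjacency(Node root) {
        Map<Node, Integer> index = new IdentityHashMap<>();
        List<Node> order = new ArrayList<>();
        if (root != null) {
            index.put(root, 0);
            order.add(root);
        }
        for (int i = 0; i < order.size(); i++) {
            Node n = order.get(i);
            for (Node succ : new Node[] {n.left, n.right}) {
                if (succ != null && !index.containsKey(succ)) {
                    index.put(succ, order.size());
                    order.add(succ);
                }
            }
        }
        int[][] a = new int[order.size()][order.size()];
        for (int i = 0; i < order.size(); i++) {
            Node n = order.get(i);
            for (Node succ : new Node[] {n.left, n.right}) {
                if (succ != null) {
                    a[i][index.get(succ)]++;
                }
            }
        }
        return a;
    }

    // In-degree of vertex j is the sum of column j.
    static int inDegree(int[][] a, int j) {
        int degree = 0;
        for (int[] row : a) {
            degree += row[j];
        }
        return degree;
    }

    // Acyclic iff A^n = 0 (A is nilpotent), i.e. iff all eigenvalues of A are zero.
    // Boolean products are used so that walk counts cannot overflow.
    static boolean isAcyclic(int[][] a) {
        int n = a.length;
        boolean[][] power = new boolean[n][n]; // starts as the pattern of A^1
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                power[i][j] = a[i][j] > 0;
            }
        }
        for (int step = 1; step < n; step++) { // after the loop: pattern of A^n
            boolean[][] next = new boolean[n][n];
            for (int i = 0; i < n; i++) {
                for (int k = 0; k < n; k++) {
                    if (!power[i][k]) continue;
                    for (int j = 0; j < n; j++) {
                        if (a[k][j] > 0) next[i][j] = true;
                    }
                }
            }
            power = next;
        }
        for (boolean[] row : power) {
            for (boolean cell : row) {
                if (cell) return false; // a walk of length n exists, so there is a cycle
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.left = new Node();
        root.right = new Node();
        int[][] a = adjacency(root);
        System.out.println("acyclic: " + isAcyclic(a));
        System.out.println("in-degree of root: " + inDegree(a, 0));
    }
}
```

The nilpotency test avoids a numerical eigenvalue solver, which would require a linear algebra library; it checks only the all-zero spectrum and says nothing about other spectral properties.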
Our technique builds on the Deryaft framework [12],
[15], which we developed in previous work, for generating
likely representation invariants. Deryaft takes its inspira-
tion from the Daikon invariant detector [6]. In contrast
to Daikon, which is a general purpose invariant detection
engine, Deryaft focuses on structural properties and as such
generates more accurate structural invariants. We follow
the general approach introduced by Deryaft for structural
invariants: first, identify core and derived fields of a data
structure; and then, check which properties from a pre-
defined collection of properties hold for the field values for
a given set of program states. The properties that hold for a
given set of states are used in two ways: (1) to directly
check if a new program state satisfies them; and (2) to
generate a representation of the properties as an executable
Java predicate, which can be used in a number of ways, e.g.,
as a runtime assertion or to perform data structure repair [5].
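The general approach above can be illustrated with a small Java sketch. It is not Deryaft's actual code or API: the class LikelyInvariantSketch, the singly linked list state, the three candidate properties, and the choice of header as a core field and size as a derived field are all assumptions made for illustration. The sketch keeps the candidate properties that hold on every observed state and then checks a new state against them, which corresponds to use (1); generating a standalone Java predicate, use (2), is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: names and candidate properties are illustrative only.
class LikelyInvariantSketch {

    static final class Node {
        Node next;
    }

    // One program state of a singly linked list.
    static final class ListState {
        Node header; // assumed core field: defines the structure
        int size;    // assumed derived field: should agree with the structure
    }

    // Counts distinct nodes reachable from header, stopping if a node repeats.
    static int countNodes(ListState s) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        for (Node n = s.header; n != null && seen.add(n); n = n.next) { }
        return seen.size();
    }

    // True when following next from header never revisits a node.
    static boolean isAcyclic(ListState s) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        for (Node n = s.header; n != null; n = n.next) {
            if (!seen.add(n)) return false;
        }
        return true;
    }

    // The pre-defined collection of candidate properties.
    static Map<String, Predicate<ListState>> candidates() {
        Map<String, Predicate<ListState>> c = new LinkedHashMap<>();
        c.put("acyclic", LikelyInvariantSketch::isAcyclic);
        c.put("sizeEqualsNodeCount", s -> s.size == countNodes(s));
        c.put("nonEmpty", s -> s.header != null);
        return c;
    }

    // Keeps the names of the candidates that hold on every observed state.
    static List<String> likelyInvariants(List<ListState> states) {
        List<String> held = new ArrayList<>();
        for (Map.Entry<String, Predicate<ListState>> e : candidates().entrySet()) {
            if (states.stream().allMatch(e.getValue())) {
                held.add(e.getKey());
            }
        }
        return held;
    }

    // Checks a new state against previously inferred invariants.
    static boolean satisfies(ListState s, List<String> invariants) {
        Map<String, Predicate<ListState>> c = candidates();
        for (String name : invariants) {
            if (!c.get(name).test(s)) return false;
        }
        return true;
    }

    // Builds a well-formed list state with the given number of nodes.
    static ListState listOf(int length) {
        ListState s = new ListState();
        for (int i = 0; i < length; i++) {
            Node n = new Node();
            n.next = s.header;
            s.header = n;
        }
        s.size = length;
        return s;
    }

    public static void main(String[] args) {
        List<String> inv = likelyInvariants(Arrays.asList(listOf(0), listOf(2)));
        ListState corrupted = listOf(2);
        corrupted.header.next.next = corrupted.header; // introduce a cycle
        System.out.println("likely invariants: " + inv);
        System.out.println("corrupted state passes: " + satisfies(corrupted, inv));
    }
}
```

Because the empty list is among the observed states, the nonEmpty candidate is discarded, while the other two candidates survive and flag the corrupted state.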
A key advantage of using graph spectra over Deryaft’s
approach is that, in principle, they allow checking for
(violation of) properties that need not be pre-defined and are
computed solely from the program states once they are
encountered. Thus, graph spectra not only introduce a novel
abstraction for properties of program state, but they also
enhance our ability to dynamically detect a larger class
of errors without requiring the user to provide detailed
specifications. As a first step to enable detecting properties
that are not directly characterized in spectral graph theory,
we conjecture that an invariant learning mechanism using
support vector machines [19] may provide a viable solution.
2012 IEEE Fifth International Conference on Software Testing, Verification and Validation
978-0-7695-4670-4/12 $26.00 © 2012 IEEE
DOI 10.1109/ICST.2012.33