A Rule Engine for Query Transformation in
Starburst and IBM DB2 C/S DBMS
Hamid Pirahesh, T.Y. CliffLeung, Waqar Hasan
IBM Almaden Research Center
{pirahesh,cleung} @almaden.ibm.corn, hasan@informix.com
Abstract
The complexi~ of queries in relational DBMSs is increas-
ing particularly in the decision support area and interactive
client server environments. This calls for a more power-
ful and flexible optimization of complex queries. In [19]
we introduced query rewrite as a distinct query optimiza-
tion phase mainly targeted to responding to this require-
ment. This approach has enabled us to extensively enrich
the optimization rides in our system. Further, it has made it
easier to incrementally enrich and adapt the system as need
arises. Examples of such query optimizations are predicate
pushdown, subquery and magic sets transformations, and
decorrelating subqueries. In this paper we describe the de-
sign and implementation of a rule engine for query rewrite
optimization. Each transformation is implemented as a rule
which consists of a pair of rule condition and action [19].
Rules can be grouped into rule classes for higher efficiency,
better understandability and more extensibility. The rule
engine has a number of novelties in that it supports a full
spectrum of control --from totally data-driven to totally
procedural. Furthermore, it incorporates a budget control
scheme for controlling the resources taken for query opti-
mization as well as guaranteeing the termination of rule ex-
ecution. The rule engine and a suite of query rewrite rules
have been implemented in Starburst relational DBMS pro-
totype and a significant portion of this technology has been
integrated into IBM DB2 Common Server relational DBMS.
1 Introduction
There is an increasing demand on efficient handling of com-
plex queries in relational DBMS. Such demands have in-
creased rapidly in recent years due to advancement of GUI
in query managers. Often, such GUIs automatically gen-
erate SQL queries, enabling users to pose very complex
queries. Such machine generated queries are not hand-
optimized by programmers, hence demanding better query
optimization within the DBMS.
In Starburst [12] 1, we have implemented a distinct opti-
mization phase, called query rewrite, in direct response to
this problem. We have chosen a rule based system for query
rewrite optimization. Such a system allows us to keep the
repertoire of optimization methods open-ended, consider-
ably reducing the effort it usually takes to enrich the opti-
mization repertoire as the user requirements evolve. Provid-
ing such a flexibility must not be at the expense of optimiza-
tion time. Often in interactive environments, the queries are
ad-hoc, requiring on-the-fly query optimization. This de-
mands a high performance rule system since the time spent
in query rewrite rule inferencing is part of the query re-
sponse time. The growing list of rewrite rules implemented
in this system includes predicate pushdown, subquery to
join transformation, magic sets transformation, handling of
duplicates, merging of views, and decorrelating complex
subqueries [16, 17, 15, 19, 22, 18, 4, 23]. These transforma-
tions often lead to orders of magnitude improvement in per-
formance, making query rewrite an important component of
our system.
In this paper we describe the design and implementation
of a rule engine for query rewrite optimization in Starburst.
Each transformation is implemented as a rule which con-
sists of a pair of rule condition and action [ 19]. Rules can
be grouped into rule classes. We have opted for building
a new rule engine so that (1) it is fast enough for interac-
tive queries; (2) it can handle the complexity of our rewrite
rules; (3) it can suitably be made an integrated component
of a relational DBMS architecture.
In our design, we chose a rule based architecture because
it allows us to achieve an extensible design [3, 7, 2, 8, 11,
9, 1, 5, 10]. For query rewrite, the important advantages for
using a rule based system are modifiability (and thus exten-
sibility) and explanation. For example, it is important to be
able to add a new query rewrite rule into the system without
modifying the existing rules, to have an explanation of how
the rule system arrives at a solution for a given query, and
to influence the way the rule engine does its inferencing for
1In the rest of the paper,a reference to Starburst is implicitly a reference
to IBM DB2 CommonServerVersion2 DBMSunlessotherwisenoted.
1063-6382/97 $10.00 © 1997 IEEE
391