© The British Computer Society 2014. All rights reserved.
For Permissions, please email: journals.permissions@oup.com
doi:10.1093/comjnl/bxu047
A Method to Find Functional Dependencies Through Refutations and Duality of Hypergraphs
Joel Fuentes¹,∗, Pablo Sáez¹, Gilberto Gutiérrez¹ and Isaac D. Scherson²
¹ Department of Computer Science and Information Technologies, Universidad del Bío-Bío, Chillán, Chile
² Department of Computer Science, University of California, Irvine, CA, USA
∗ Corresponding author: jfuentes@ubiobio.cl
One of the most important steps in obtaining a relational model from legacy systems is the extraction of
functional dependencies (FDs) through data mining techniques. Several methods have been proposed
for this purpose, and most use direct search methods that traverse the search space in time exponential
in the number of attributes of the relation. Since relations with tens of attributes are not uncommon
in practice, more efficient techniques for finding FDs are needed. The
method studied here finds the minimal set of minimal FDs using algorithms that solve the hypergraph
duality problem, applied to the complement of the refutation hypergraph of the relation, without going
through the exponential search space. After showing that the extraction of FDs can be reduced to the
hypergraph duality problem, experimental results are given to verify the correctness and characterize
the time complexity of the proposed tool.
Keywords: functional dependencies; duality of hypergraphs; minimal transversals
Received 4 December 2013; revised 1 May 2014
Handling editor: Rada Chirkova
1. INTRODUCTION
The extraction of functional dependencies (henceforth FDs)
from an instance of a relation is an important data mining
technique, used in database design, query optimization and
reverse engineering, among others [1]. Studies such as [2] propose
methods and strategies to obtain the relational database model
from legacy systems, where one of the steps is the automatic
extraction of the FDs. This shows the importance of having
efficient tools that perform this task.
A number of tools and algorithms have indeed been proposed
for this purpose, but most of them run in time exponential in the
number of attributes of the relation. In real situations, however, it
is common to have relations with a high number of attributes
(for instance, more than 20 or 30). This motivates
the search for more efficient techniques than those used by
these tools. We will show in the following sections that
the problem of finding FDs can be efficiently reduced, by
making use of FD refutations, to the well-known hypergraph
duality problem [3], for which quasi-polynomial algorithms,
for instance running in $O(n^{\log n})$ time, are known [4]. The idea of using
hypergraph transversals for inferring FDs was independently
proposed in [5, 6], where refutations were referred to as antikeys.
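To make the notion of a minimal transversal concrete, the following is a minimal Python sketch of the classical incremental (Berge-style) computation: edges are processed one at a time, each partial transversal is extended so that it also hits the new edge, and non-minimal candidates are discarded. The function name `minimal_transversals` and the toy hypergraph are assumptions made for illustration only. This simple procedure is exponential in the worst case; it is not the quasi-polynomial algorithm of [4], nor necessarily the algorithm used by the tool described in this paper.

```python
from typing import FrozenSet, Iterable, Set


def minimal_transversals(edges: Iterable[Iterable[str]]) -> Set[FrozenSet[str]]:
    """Return all inclusion-minimal vertex sets that intersect every edge.

    Illustrative Berge-style sketch (hypothetical helper, not the paper's code).
    """
    # With no edges processed yet, the empty set is the only minimal transversal.
    transversals: Set[FrozenSet[str]] = {frozenset()}
    for raw_edge in edges:
        edge = frozenset(raw_edge)
        candidates: Set[FrozenSet[str]] = set()
        for t in transversals:
            if t & edge:
                # t already hits the new edge, so it is kept unchanged.
                candidates.add(t)
            else:
                # Otherwise extend t by each vertex of the new edge in turn.
                # An empty edge yields no extension: nothing can hit it.
                candidates.update(t | {v} for v in edge)
        # Discard every candidate that strictly contains another candidate.
        transversals = {
            c for c in candidates if not any(other < c for other in candidates)
        }
    return transversals


# Toy hypergraph with edges {a, b} and {b, c}.
example = minimal_transversals([{"a", "b"}, {"b", "c"}])
```

For the toy hypergraph above, the minimal transversals are {b} and {a, c}: each of them intersects both edges, and no proper subset of either does.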
Our main contributions in this paper can be summarized as
follows:
1. Implementation of an efficient computational method to
obtain the set of refutations for FDs, given an instance
r of a relation R.
2. A method to store and process these refutations,
represented as hyperedges of a hypergraph.
3. The computation of all the minimal FDs that are valid in
an instance of a relation, by means of the computation
of the set of minimal transversals of this hypergraph.
4. A complete tool to compute the set of minimal FDs,
together with an analysis of the time spent by the tool
on the two main processes.
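The role of refutations in the contributions listed above can be illustrated with a small Python sketch. It assumes that a refutation of X → A is a pair of tuples that agree on X but differ on A; under that assumption, X → A is valid in the instance exactly when X contains, for every pair of tuples that differ on A, at least one other attribute on which the pair also differs. The names `refutation_complements` and `fd_holds`, and the sample relation, are hypothetical. The sketch compares all pairs of tuples naively and is not the efficient method implemented by the tool.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Sequence, Set

Row = Dict[str, object]


def refutation_complements(
    rows: Sequence[Row], attrs: Sequence[str], rhs: str
) -> Set[FrozenSet[str]]:
    """For each pair of tuples differing on `rhs`, collect the other
    attributes on which the pair also differs (one hyperedge per pair).

    Hypothetical helper: a naive quadratic scan over all pairs of tuples.
    """
    others = [a for a in attrs if a != rhs]
    edges: Set[FrozenSet[str]] = set()
    for t1, t2 in combinations(rows, 2):
        if t1[rhs] != t2[rhs]:
            edges.add(frozenset(a for a in others if t1[a] != t2[a]))
    return edges


def fd_holds(lhs: Iterable[str], edges: Set[FrozenSet[str]]) -> bool:
    """lhs -> rhs is valid iff lhs intersects every hyperedge."""
    lhs_set = set(lhs)
    return all(lhs_set & edge for edge in edges)


# Made-up sample instance, for illustration only.
attrs = ["emp", "dept", "mgr"]
rows = [
    {"emp": 1, "dept": "A", "mgr": "X"},
    {"emp": 2, "dept": "A", "mgr": "X"},
    {"emp": 3, "dept": "B", "mgr": "Y"},
    {"emp": 4, "dept": "C", "mgr": "Y"},
]

edges_mgr = refutation_complements(rows, attrs, "mgr")
edges_dept = refutation_complements(rows, attrs, "dept")

dept_determines_mgr = fd_holds({"dept"}, edges_mgr)
mgr_determines_dept = fd_holds({"mgr"}, edges_dept)
```

In this sample instance, dept → mgr is valid, whereas mgr → dept is refuted by the last two tuples, which share the manager Y but belong to different departments. The left-hand sides of the minimal FDs with a given right-hand side are then the minimal transversals of the corresponding set of hyperedges.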
The present article is divided into six sections. In Section 2,
the problem is stated and related work is reviewed. In Section 3,
we briefly recall the hypergraph duality problem and the known
algorithms that solve it. In Section 4, we present the proposed
approach to the problem. Finally, Sections 5 and 6 present
experimental results and conclusions, respectively.
Section A: Computer Science Theory, Methods and Tools
The Computer Journal, 2014
The Computer Journal Advance Access published June 9, 2014