Info rm atio n Sy ste m s Vol. 14, No. 3, pp. 247-259, 1989 0306-4379/89 $3.00 + 0.00 Printed in Great Britain. All rights reserved Copyright 0 1989 MaxwellPergamon Macmillanplc AN AUTOMATED TOOL FOR RELATIONAL DATABASE DESIGN SUDHA RAM and STEPHENM. CURRAN Department of Management Information Systems, College of Business and Public Administration, University of Arizona, Tucson, AZ 85721, U.S.A. zyxwvutsrqponmlkjihgfedcbaZY (Received 11 August 1988; in revised form 11 February 1989) Abstract-This paper addresses relational database design using the concept of functional dependencies (FDs). The classical synthesis approach processes a given set of functional dependencies to produce one minimal cover. This cover is then used to develop a relational schema; however, a given set of FDs may have more than one minimal cover. In turn, different minimal covers may give rise to different relational schemata. An enhancement is proposed to the traditional synthesis algorithm that aids in efficiently determining all minimal covers for a given set of functional dependencies. We have implemented a tool called SYNTHESIZER that uses the modified synthesis algorithm to produce relations in Third Normal Form. SYNTHESIZER not only preserves dependencies, but also enforces the lossless join property. SYNTHESIZER has been implemented in Turbo Pascal to operate on IBM-PC compatibles. Expert design heuristics have been incorporated into this tool. The tool supports the requirements collection, conceptual and logical design phases of database design. SYNTHESIZER has been extensively validated by designers in more than 10 different database design projects. 1. INTRODUCTION Since their emergence in the early 1970s database management systems (DBMSs) have become an integral part of most business corporations. The contribution of DBMS technology has grown to the extent that most businesses regard a corporate database as a major intangible asset. Yet, despite more than a decade of general use, database design remains more of an art than a science. At present, no single methodology dominates the design process. Consequently automated tools based on these divergent methodologies also vary in their approach. Some can be used by a designer who has little knowledge of database systems, while others require a fair amount of expertise with database systems and design techniques. Typically, these tools focus on one phase of database design, but, to be of any real use, these tools should provide integrated support for all phases of the design process. This paper describes a new automated relational database design tool called SYNTHESIZER. Before examining SYNTHESIZER, we present a brief re- view of database design and current database design aids in Section 2. Section 3 describes the modified synthesis algorithm that forms the basis for SYN- THESIZER. Section 4 presents SYNTHESIZER and section 5 concludes by describing directions for future extensions. 2. DATABASE DESIGN APPROACHES 2.1. Overview of the database design process Database design methodology can be divided into four phases. Each of these phases is described as follows: 247 l Database planning and requirements analysis: Database planning is the process of identifying current and future information requirements for an organization. The goal of this step is to collect and analyze the heterogeneous descriptions used to express requirements and to transform them into a homogeneous description of the applica- tions. This description should be independent of the eventual implementation system because user requirements determine both the static (data) and dynamic (operations, events) aspects of an appli- cation. The various input and output documents are automatically identified as this process is carried out. Required entities, data stores and uses are defined during this phase. l Conceptual design: The conceptual database can be considered as a model of the data required, that is independent of the individual applica- tions and the software and hardware systems anticipated being used. The conceptual database should reflect the inherent structure of the infor- mation model being utilized. In this phase the entities, data stores and data items identified in the previous phase are integrated to reflect their interrelationships. One of the most popular graphical tools used for this purpose is the Entity Relationship Model (ERM) [l, 21. This model helps in analyzing and representing the associa- tions between entities and the data items associ- ated with each entity. Another model that is very popular because of its ability to portray the meaning of the database-the Semantic Data Model-is also very useful at this stage [3]. Bachman’s Data Structure Diagram is another