Database Challenges and Solutions in Neuroscientific Applications

Ali E. Dashti, Shahram Ghandeharizadeh, James Stone, Larry W. Swanson, and Richard H. Thompson

Departments of Biological Science and Computer Science, Program in Neural, Informational, and Behavioral Sciences, USC Brain Project, University of Southern California, Los Angeles, California 90089-2520

Received May 22, 1996

In the scientific community, the quality and progress of various endeavors depend in part on the ability of researchers to share and exchange large quantities of heterogeneous data with one another efficiently. This requires controlled sharing and exchange of information among autonomous, distributed, and heterogeneous databases. In this paper, we focus on a neuroscience application, the Neuroanatomical Rat Brain Viewer (NeuART Viewer), to demonstrate alternative database concepts that allow neuroscientists to manage and exchange data. Requirements for the NeuART application, in combination with an underlying network-aware database, are described at a conceptual level. Emphasis is placed on functionality from the user's perspective and on requirements that the database must fulfill. The most important functionality required by neuroscientists is the ability to construct brain models using information from different repositories. To accomplish such a task, users need to browse remote and local sources and summaries of data and capture relevant information to be used in building and extending the brain models. Other functionalities are also required, including posing queries related to brain models, augmenting and customizing brain models, and sharing brain models in a collaborative environment. An extensible object-oriented data model is presented to capture the many data types expected in this application.
After presenting conceptual-level design issues, we describe several known database solutions that support these requirements and discuss requirements that demand further research. Data integration for heterogeneous databases is discussed in terms of reducing or eliminating semantic heterogeneity when translations are made from one system to another. Performance enhancement mechanisms such as materialized views and spatial indexing for three-dimensional objects are explained and evaluated in the context of browsing, incorporating, and sharing. Policies for providing the system with fault tolerance and avoiding possible intellectual property abuses are presented. Finally, two existing systems are evaluated and compared using the identified requirements. © 1997 Academic Press

INTRODUCTION

Research on the design, development, management, and use of database systems has traditionally focused on business-like applications. However, concepts developed for such applications fail to support the diverse needs of scientific applications. Next-generation database applications, enabled by the explosion of digitized information over the last 5 years, require the solution of significant new research problems. Scientific and other biomedical applications generate and require access to an extraordinarily large range of multimedia data formats: numbers, symbols, texts, images, and others. Moreover, the data are stored on geographically distributed, autonomous, and heterogeneous systems (Winslett, 1992; Silberschatz et al., 1995). Yet the quality and progress of scientific endeavors depend in part on the ability of researchers to share and exchange large amounts of heterogeneous data with one another efficiently. Clearly, applications are required that allow for the controlled sharing and exchange of information among autonomous, distributed, and heterogeneous databases (Hammer, 1994).
To address the pressing need for technological advances, organizations such as the National Science Foundation, the National Institutes of Health, and the Department of Defense are sponsoring a number of nationally funded projects, including the Human Brain Project. These projects share a number of common goals: (1) the integration of heterogeneous data sources into a coherent view, (2) efficient access to, and manipulation of, large data volumes, (3) customized access to relevant data, and (4) the ability to share in a collaborative environment.

NEUROIMAGE 5, 97–115 (1997), Article No. NI960253. 1053-8119/97 $25.00. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.

The issues discussed here have emerged from our work on the Human Brain Project initiative, which is a collaboration between neuroscience and database researchers to realize a digital collaborative environment that does not sacrifice research group individuality. In this paper, we focus on the alternative design issues of a specific application, a Neuroanatomical Rat Brain Viewer (NeuART Viewer), and discuss relevant concepts developed by the database community. For each