A Practical Approach for Microscopy Imaging Data Management (MIDM) in Neuroscience

Shenglan Zhang 1, Xufei Qian 2, Amarnath Gupta 2, Maryann E. Martone 1,2

1 National Center for Microscopy and Imaging Research, Center for Research in Biological Structure and Dept. of Neuroscience, University of California, San Diego, La Jolla, CA 92093-0608, and 2 San Diego Supercomputer Center, University of California, San Diego, CA 92093-0505, USA

{szhang, maryann}@ncmir.ucsd.edu, {xqian,gupta}@sdsc.edu

Abstract

Current data management approaches can easily handle the relatively simple requirements of molecular biology research but not the more varied and sophisticated microscopy imaging data of neuroscience research. We developed a project-oriented experimental imaging data management system by integrating the object-relational Oracle DBMS with a distributed file management system, the Storage Resource Broker (SRB). The data model we developed on Oracle9i supports semantic and analytical queries and image content mining. The MIDM provides comprehensive descriptive, structural, spatial and administrative information on microscopy image datasets. The current MIDM is web accessible at http://ncmir.ucsd.edu/CCDB. This paper describes the MIDM architecture and the data model in MIDM.

1. Introduction

Over 300 databases are now available in the field of molecular biology [1]. Current data management approaches include structured flat files or XML-based methods, relational database management systems (DBMSs), object-relational DBMSs and object-oriented DBMSs [2]. Database resources for scientists engaged in research at the cellular and tissue levels using microscopy imaging are scarce. Although some on-line biology image databases have tried to facilitate image exchange and management, e.g., the QBIC [3], BioImage [4] and PSLID [5] systems, most systems do not extract and model the content of the images produced by scientific instruments.
The rich images and image metadata acquired by light and electron microscopy techniques cannot be maintained and managed by any of the current management systems. The design and implementation of image management systems for neuroscience data face several scientific and technological challenges, including: maintaining large images (typically 3-10 GB) of various types at different storage locations; performing semantic queries in the face of obscure and heterogeneous scientific nomenclature; performing analytical queries on tree-structured neuron objects obtained with different scientific instruments, different preparation methods and multiple microscopy image processing steps; and performing spatial queries on multi-resolution and multi-scale microscopy images. This paper describes a microscopy imaging data management system (MIDM) and an object-relational data model that address data grid, data federation, image content retrieval and data lineage issues in managing 2D and 3D microscopic imaging data.

2. Architecture of MIDM

The MIDM was designed to store 2D and 3D light and electron microscopy images, reconstructed images, image analyses, image descriptors and image-related experimental data. The microscopy data resources contain heterogeneous multimedia information. The potential multimedia information in the MIDM includes:

(i) Still images: Individual micrographs or derived 2D data encoded in standard formats (e.g. JPEG). These images may form an ordered sequence, related to one another through one or more parameters, e.g., tilt angle or time;
(ii) Mixed multimedia data: Compressed image files and parameter files that are bundled together as one 3D volume;
(iii) Animations: A sequence of images (e.g. MPEG) taken at different tilt angles to illustrate a reconstructed 3D cell structure;
(iv) Graphics: Drawings or illustrations encoded using a descriptive standard (e.g. PICT);
(v) Spreadsheets: Formatted cell structure analysis files from a stored data set (e.g. ASCII).

These different types of image and analysis files may be stored on distributed archival resources. As middleware, the Storage Resource Broker (SRB) manages the MIDM's heterogeneous data files, which come in different formats and are distributed across different types of storage devices over the network. SRB provides access to data stored on distributed archival