Performance Evaluation of Multidimensional Array Storage Techniques in Databases

Norbert Widmann, Peter Baumann
Bavarian Research Centre for Knowledge-Based Systems (FORWISS)
Orleansstr. 34, D-81667 Munich, Germany
{widmann, baumann}@forwiss.de

(Sponsored by the European Commission in the ESPRIT Domain 4: Long-Term Research under grant no. 20073.)

Abstract

Storing multidimensional data in databases is an important topic both in the industrial and scientific database communities. Arrays are offered as a multidimensional data structure by most programming languages. Conventional database systems, however, do not support arrays of arbitrary dimensionality and base type. RasDaMan is a DBMS integrating arrays as a first class data type offering both a declarative query language and a specialised storage structure for arrays.

The work presented here evaluates the performance of queries on multidimensional array data stored in RasDaMan versus storage in a conventional RDBMS. In the relational system, the data is both mapped to relations and stored directly as binary data in BLOBs. The queries executed were modelled after queries common in scientific applications and decision support.

1. Introduction

Multidimensional data is getting more and more relevant in the database community. On the design level, multidimensional data models have been developed [1] enabling the user to present problems from his application domain directly in their multidimensional form. The storage of data, however, is commonly done in RDBMSs mapping the multidimensional model to relations.

Multidimensional arrays are the data type for processing multidimensional data in most programming languages. Support for multidimensional arrays in a DBMS enables users to store data from their application area in a database without mapping it to another data model. The RasDaMan DBMS supports array storage with a declarative query language and specialised storage structures. The query
language offers a number of operations on arrays. RasDaMan was developed at the Bavarian Research Centre for Knowledge-Based Systems (FORWISS) during the ESPRIT Long-Term Research project of the same name.

Performance is a key factor when dealing with array data, as large amounts of data are typically stored and processed. Online archives for remote sensing data are planned with sizes in the terabyte range; data warehouses serving as the basis for OLAP applications and storing hundreds of gigabytes are not uncommon. RasDaMan, as a specialised DBMS, offers an optimising query language and specialised storage techniques specifically geared towards the requirements of applications dealing with multidimensional data [3].

This paper compares the performance of a specialised DBMS with that of a commercial relational system. Section 2 introduces the RasDaMan DBMS. The storage alternatives for arrays in RDBMSs are discussed in Section 3. The queries and data used for performance evaluation are explained in Section 4, followed by a discussion of the results in Section 5. Section 6 presents our conclusions.

2. The RasDaMan System

The RasDaMan system offers full DBMS support for multidimensional arrays. It is a client/server system providing the declarative query language RasQL and a C++ API called RasLib for users to implement their applications. The logical view of arrays is independent of their physical storage, which is done in tiles [2]. The system is implemented on top of an ODBMS. Query processing, optimisation, and operation execution are all done in the RasDaMan server, while the base DBMS is used only for storage.

The following gives a short overview of the RasQL query language. For a more complete treatment, please refer to publications focusing on the implementation of the RasDaMan system [4].

The RasDaMan Query Language, RasQL, extends SQL-92 with operations on multidimensional arrays or parts