Developing a DataBlade for a New Index

Rasa Bliujūtė, Simonas Šaltenis, Giedrius Slivinskas, Christian S. Jensen
Department of Computer Science, Aalborg University
Fredrik Bajers Vej 7E, 9220 Aalborg, Denmark
{rasa, simas, giedrius, csj}@cs.auc.dk

Abstract

In order to better support current and new applications, the major DBMS vendors are stepping beyond uninterpreted binary large objects, termed BLOBs, and are beginning to offer extensibility features that allow external developers to extend the DBMS with, e.g., their own data types and accompanying access methods. Existing solutions include DB2 extenders, Informix DataBlades, and Oracle cartridges. Extensible systems offer new and exciting opportunities for researchers and third-party developers alike. This paper reports on an implementation of an Informix DataBlade for the GR-tree, a new R-tree based index. This effort represents a stress test of what is perhaps currently the most extensible DBMS, in that the new DataBlade aims to achieve better performance, not just to add functionality. The paper provides guidelines for how to create an access method DataBlade, describes the sometimes surprising challenges that must be negotiated during DataBlade development, and evaluates the extensibility of the Informix Dynamic Server.

1. Introduction

Advanced applications continuously emerge that pose new requirements to database management systems, including the need for efficient handling of the complex types of data inherent to geographical, multimedia, medical, and other advanced applications. Such data include images, videos, and documents, as well as data with temporal and spatio-temporal references. Most relational DBMSs provide binary large objects, which may be used for storing such data, but this is generally not satisfactory because the internal structure of the data is invisible to the DBMS, which then cannot provide efficient access to the data.
Support for new data types can also be introduced at the application level. But this does not provide efficient access, and it is uneconomical for the many applications that need similar support to each reimplement similar ad-hoc solutions.

New complex data types, including efficient querying capabilities on them, should be supported by the DBMS. Because new applications will continue to appear that require support for new kinds of data, the DBMS should be extensible, allowing the users themselves to extend the DBMS's functionality. This relieves the vendors of having to keep up with the demands for new data types, and it allows users to obtain support for very specific kinds of data for which there is only a very small market and which the vendors therefore have little incentive to support.

Indeed, over the last couple of years, major DBMS vendors have come up with new technology that allows the users themselves to extend the DBMS's functionality. Examples include DB2 extenders, Informix DataBlades, and Oracle cartridges. Extenders, DataBlades, and cartridges can be developed separately and plugged into the appropriate DBMS. This technology allows application developers to add new functionality to a DBMS according to their concrete needs, and it gives third-party vendors an opportunity to make products targeting a specific application area. In addition, extensible database technology reduces the gap between real products and new techniques proposed by the research community, because these techniques can be integrated into DBMSs more easily. This facilitates dissemination of research results and the transition from research results to products.

The paper describes a prototype implementation of a new access method, termed the GR-tree [4], as an Informix DataBlade.
Based on the R*-tree [3] (an improved version of the R-tree originally proposed by Guttman [7]), this tree indexes now-relative bitemporal data, which is data with associated valid-time and transaction-time values [14]. Many real-world databases contain a significant portion of this type of data.

©1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

The paper reports the experiences gained from developing the DataBlade. It provides systematic guidelines for how to create an access method DataBlade, while also pointing out issues, expected as well as unexpected, that proved to be particularly challenging when building the DataBlade. Informix was chosen because it provides the possibility to add advanced user-defined data types as well as user-defined access methods for these new data types. The paper covers is-