ODMQL: OBJECT DATA MINING QUERY LANGUAGE

MOHAMED G. ELFEKY * mgelfeky@cs.purdue.edu
AMANI A. SAAD amani@alex.eun.eg
SOUHEIR A. FOUAD souheir@alex.eun.eg

Computer Science and Automatic Control Department
Faculty of Engineering, Alexandria University

* Currently a Ph.D. student in the Department of Computer Sciences, Purdue University. 1398 Computer Science Building, West Lafayette, IN 47907-1398. Phone: (765) 494-6020.

ABSTRACT

Data mining is the discovery of knowledge and useful information from the large amounts of data stored in databases. The emergence of data mining tools and systems has created demand for a powerful data mining query language. The concepts of such a language for relational databases are discussed in [11]. With the increasing popularity of object-oriented databases, it is important to design a data mining query language for such databases. The main objective of this paper is to propose an Object Data Mining Query Language (ODMQL) for object-oriented databases as an extension to the Object Query Language (OQL), proposed in [6] by the Object Data Management Group (ODMG) as a standard query language for object-oriented databases. The proposed language is implemented as a feature of the ALEX Object-Oriented Database Management System [8, 16], an ongoing project serving research areas related to object-oriented databases.

Keywords: Knowledge Discovery, Data Mining in Object-Oriented Databases, Data Mining Query Language.

1 INTRODUCTION

Data mining is the discovery of knowledge and useful information from the large amounts of data stored in databases [12]. Considerable research has addressed mining specific kinds of knowledge from relational databases, such as [3, 10, 15, 17, 18, 19]. Several experimental data mining systems have also been developed for relational databases, such as DBMiner [9], Explora [14], MineSet [5], and Quest [2].
The objective of such systems is to mine different kinds of knowledge by offering isolated discovery features. Such systems cannot be embedded into a large application and typically offer just one knowledge discovery feature [13]. The ad hoc nature of knowledge discovery queries in large applications calls for an efficient query language much more general than SQL; such a language is called a Data Mining Query Language (DMQL). An example of such a language for relational databases is found in [11].

Although a wide variety of advanced database systems rely heavily on the object-oriented data model, there is no data mining query language for object-oriented databases. This motivates us to propose such a language in this paper.

The rest of this paper is organized as follows. Section 2 describes the design principles of object data mining query languages. Section 3 introduces the main features of the proposed language. A formal and complete definition of the language is given in Section 4. Section 5 gives a large number of examples serving a wide variety of data mining requests. Section 6 briefly discusses the implementation of the language. Section 7 summarizes the paper and outlines future work.