The Use of Semantic-based Predicates Implication to Improve Horizontal Multimedia Database Fragmentation

Fekade Getahun, Solomon Atnafu
Department of Computer Science, Faculty of Informatics
Addis Ababa University, 1176 Addis Ababa, Ethiopia
{fekadeg, satnafu}@cs.aau.edu.et

Joe Tekli, Richard Chbeir
LE2I-CNRS Laboratory, University of Bourgogne
21078 Dijon Cedex France
{joe.tekli, richard.chbeir}@u-bourgogne.fr

ABSTRACT
Database fragmentation reduces irrelevant data accesses by grouping data that are frequently accessed together in dedicated segments. In this paper, we address multimedia database fragmentation to take into account the rich characteristics of multimedia objects. We particularly discuss multimedia primary horizontal fragmentation and focus on semantic-based textual predicates implication, required as a pre-process in current fragmentation algorithms in order to partition multimedia data efficiently. Identifying semantic implication between similar queries (a user searching for images containing a car would probably mean auto, vehicle, van or sport-car as well) will improve the fragmentation process. Making use of the neighborhood concept in knowledge bases to identify semantic implications constitutes the core of our proposal. A prototype has been implemented to evaluate the performance of our approach.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Storage – Record Classification; Information Search and Retrieval – Search Process; H.2.7 [Database Management]: Database Administration; H.2.8 [Database Management]: Database Applications; H.2.5 [Database Management]: Heterogeneous Databases; H.2.4 [Database Management]: Systems.

General Terms
Algorithms, Measurement, Performance, Design, Experimentation.

Keywords
Multimedia Retrieval, Horizontal Fragmentation, Data Partition, Data implication

1. INTRODUCTION
Multimedia applications emerging in distributed environments, such as the web, create an increasing demand on the performance of multimedia systems, requiring new data partitioning techniques to achieve high resource utilization and increased concurrency and parallelism. Several ongoing studies aim at building distributed MultiMedia DataBase Management Systems (MMDBMS) [8]. Nevertheless, most existing systems lack a formal framework to adequately provide full-fledged multimedia operations.

Traditionally, fragmentation techniques are used in distributed system design to reduce accesses to irrelevant data, thus enhancing system performance [4]. In essence, fragmentation consists of dividing the database objects and/or entities into fragments, on the basis of common query accesses, in order to distribute them over several distant sites. While partitioning traditional databases has been thoroughly studied, multimedia fragmentation has not yet received much attention.

In this paper, we address primary horizontal fragmentation (cf. Section 2) in distributed multimedia databases. We particularly address semantic-based predicates implication required in current fragmentation algorithms, such as Make_Partition and Com_Min [1, 13, 14], in order to partition multimedia data efficiently. The need for such semantic-based implication is emphasized by the fact that annotations and values describing the same object, during the storage or retrieval of multimedia data, can be interpreted with widely different meanings. For example, a user searching for images containing a car would probably mean auto, vehicle, van or sport-car as well. Therefore, identifying semantic implication between such similar values will improve the fragmentation process (and, more particularly, will impact the choice of minterms, as we will see in the remaining sections).
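The car example can be made concrete with a short sketch. It is an illustration only and not the algorithm developed in this paper: it assumes a toy knowledge base holding synonym and IS-A links, and the names SYNONYMS, HYPERNYM, ancestors and value_implies are hypothetical. A value a is taken to imply a value b when every object annotated with a would also satisfy a query on b.

```python
# Toy knowledge base (hypothetical data, for illustration only).
SYNONYMS = {"car": {"auto"}, "auto": {"car"}}
# IS-A links: each term points to its direct, more general concept.
HYPERNYM = {
    "sport-car": "car",
    "car": "vehicle",
    "auto": "vehicle",
    "van": "vehicle",
}


def ancestors(term):
    """Return all concepts reached by following IS-A links upward."""
    seen = set()
    while term in HYPERNYM:
        term = HYPERNYM[term]
        if term in seen:  # guard against accidental cycles
            break
        seen.add(term)
    return seen


def value_implies(a, b):
    """True if an object annotated with `a` also satisfies a query on `b`.

    Assumption of this sketch: `a` implies `b` when `b` equals `a` or one of
    its ancestors, or is a synonym of `a` or of one of its ancestors.
    """
    for concept in {a} | ancestors(a):
        if concept == b or b in SYNONYMS.get(concept, set()):
            return True
    return False


if __name__ == "__main__":
    for a, b in [("sport-car", "car"), ("sport-car", "auto"),
                 ("car", "sport-car"), ("van", "car")]:
        print(f"{a} => {b}: {value_implies(a, b)}")
```

Under these assumptions, a predicate on the value sport-car selects a subset of the objects selected by the same predicate on car, while the converse does not hold. This is the kind of relationship that can influence the choice of minterms.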
The contributions of this paper can be summarized as follows: i) introducing algorithms for identifying semantic implications between predicate values, ii) introducing an algorithm for identifying semantic implications based on predicate operators, iii) putting forward an algorithm for identifying implications between semantic predicates on the basis of operator and value implications, iv) developing a prototype to test and validate our approach.

The remainder of this paper is organized as follows. Section 2 briefly reviews the background and related work in DB fragmentation. In Section 3, we present a motivating example. Section 4 defines the concepts used in our approach. In Section 5, we detail our semantic implication algorithms and their usage in the multimedia fragmentation process. Section 6 briefly presents our prototype. Finally, Section 7 concludes this work and outlines some ongoing research directions.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MS’07, September 28, 2007, Augsburg, Bavaria, Germany. Copyright 2007 ACM 978-1-59593-782-7/07/0009...$5.00.

2. BACKGROUND AND RELATED WORK
Fragmentation techniques for distributed DB systems aim to achieve effective resource utilization and improved performance [20]. This is addressed by removing irrelevant data accessed by applications and by reducing data exchange among sites [21]. In this section, we briefly present traditional database fragmentation approaches, and