Adaptive Evaluation Techniques for Querying XML-based E-Catalogs

Georg Lausen and Pedro José Marrón
Universität Freiburg, Institut für Informatik
Georges-Koehler-Allee, Geb. 51, 79110 Freiburg, Germany
{lausen,pjmarron}@informatik.uni-freiburg.de

Abstract

The integration of electronic catalogs (eCatalogs) is one of the most important aspects for the successful deployment of electronic commerce systems, since they are usually the only communication channel between buyers and suppliers. In this paper, we propose an XML-based global eCatalog integration platform whose query model allows us to avoid the costly problem of finding rewritings for each local eCatalog in the system, while at the same time providing reliable answers to a wide range of XPath queries. Our implementation relies on the following characteristics to achieve its goals: the intrinsic properties of the XPath model; the applicability and efficiency of an extensible fitness function used to evaluate each answer; and the hierarchical nature of product catalogs.

Keywords: E-Commerce, E-Catalog, XML, XPath, LDAP.

1 Introduction

Electronic Catalogs (eCatalogs) are a crucial component for Electronic Commerce (eCommerce) on the Internet, since they list, describe and, more importantly, categorize the kinds of products and services suppliers offer to their customers. ECatalogs are, therefore, the main communication channel between buyers and suppliers, only surpassed in importance by the proper and correct administration of diverse groups of catalogs from different suppliers, whose transparent integration constitutes one of the key aspects of the successful deployment of electronic commerce nowadays.
Various forms of catalog integration have been proposed in the literature [4, 1, 7], and some of them are available as commercial products. Whether they propose a centralized integrated platform as their solution, or merely a means to establish a direct channel between buyers and suppliers, projecting a uniform, integrated image to potential customers is crucial for electronic commerce [6].

In this paper, we propose an eCatalog integration platform that can be considered similar in spirit to the local-as-view schema, where queries are formulated following a predefined global catalog (gCatalog) and forwarded to the appropriate local catalogs for evaluation. As opposed to the classical view approach to catalog integration [3], our methodology does not rely on finding rewritings to perform individual queries on local catalogs, but on an adaptive query evaluation strategy that allows us to perform the same query on all local catalogs and still obtain reliable answers. The gist of our approach relies on three main factors:

1. the intrinsic properties of XPath queries over XML-based catalogs;
2. the flexibility of a fitness function that allows us to discriminate more accurate solutions from others; and
3. the nature of the conceptual hierarchy on which product catalogs are based.

The combination of these factors allows us to evaluate XPath queries over XML-based catalogs efficiently.

The rest of this paper is organized as follows. Section 2 briefly describes the XPath and XML models and introduces the running example we will use throughout this paper. Section 3 explains the details of our adaptive query evaluation model and how it can be used to provide catalog integration capabilities. Section 4 discusses the advantages and limitations of our approach, together with possible remedies for those limitations. Finally, Section 5 concludes this paper.
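To make the strategy outlined in this introduction concrete, the following is a minimal sketch of running one XPath query unchanged against every local catalog and ranking the answers by a fitness score. It is an illustration under stated assumptions, not the implementation described in this paper: the catalog contents, the supplier names, the `fitness` function (which simply measures the fraction of requested attributes an answer carries), and the helper `evaluate_everywhere` are all hypothetical, and Python's `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical local catalogs; note the different nesting depth per supplier.
LOCAL_CATALOGS = {
    "supplier_a": (
        "<catalog><electronics><tv>"
        "<product name='T1' price='300'/>"
        "</tv></electronics></catalog>"
    ),
    "supplier_b": "<catalog><tv><product name='T2'/></tv></catalog>",
}


def fitness(element, wanted_attrs):
    """Assumed scoring rule: fraction of requested attributes present."""
    if not wanted_attrs:
        return 1.0
    present = sum(1 for attr in wanted_attrs if attr in element.attrib)
    return present / len(wanted_attrs)


def evaluate_everywhere(query, wanted_attrs):
    """Run the same query on every local catalog, then rank all answers."""
    answers = []
    for supplier, xml_text in LOCAL_CATALOGS.items():
        root = ET.fromstring(xml_text)
        # No per-catalog rewriting: the query string is used as given.
        for elem in root.findall(query):
            score = fitness(elem, wanted_attrs)
            answers.append((score, supplier, elem.get("name")))
    # Highest fitness first; the sort is stable, so ties keep catalog order.
    answers.sort(key=lambda answer: answer[0], reverse=True)
    return answers


ranked = evaluate_everywhere(".//tv/product", ["name", "price"])
for score, supplier, name in ranked:
    print(f"{score:.2f}  {supplier}  {name}")
```

The descendant step `.//` is what lets a single query match both catalogs despite their different hierarchies; the scoring rule here is only a placeholder for an extensible fitness function.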
2 XML, XPath and eCatalogs

Consider the scenario, originally described in [8], where a group of electronics companies decide to offer their prod-