Cache-aware load balancing vs. cooperative caching for distributed search engines

David Dominguez-Sal, Computer Architecture Dept., DAMA-UPC, Barcelona, Spain, ddomings@ac.upc.edu
Marta Perez-Casany, Applied Mathematics II Dept., DAMA-UPC, Barcelona, Spain, marta.perez@upc.edu
Josep Lluis Larriba-Pey, Computer Architecture Dept., DAMA-UPC, Barcelona, Spain, larri@ac.upc.edu

Abstract

In this paper we study the performance of a distributed search engine from a data caching point of view. We compare and combine two approaches to achieving better hit rates: (a) sending queries to the node that currently holds the related data in its local memory (cache-aware load balancing), and (b) sending cached contents to the node where a query is currently being processed (cooperative caching). Furthermore, we study the best scheduling points in the query computation at which queries can be reassigned to another node, and how this reassignment should be performed. Our analysis is guided by statistical tools applied to a real question answering system under several query distributions typically found in query logs.

1. Introduction

The construction of distributed search engines is a complex task in which many components with high computational cost interact. New search engines combine additional modules to refine an answer and achieve better precision. However, these more advanced features come with large computational costs that must be addressed to make the systems scalable. We take Question Answering (QA) as an example of these next-generation search engines. QA systems return short, precise answers, e.g., person and location names, in response to natural language questions [1]. For example, a QA system that receives as input the question "In which city is the Eiffel Tower?" will answer "Paris". Caching and distributed systems are two fundamental pillars required to improve the final performance of these systems.
In this paper, we study how these two factors interact and how they impact the performance of a fully fledged search engine.

(The authors want to thank Generalitat de Catalunya for its support through grant number GRE-00352, Ministerio de Educación y Ciencia of Spain for its support through grant TIN2006-15536-C02-02, and the European Union for its support through the Semedia project (FP6-045032).)

A search engine receives many queries with overlapping computation: queries may share terms, they may access similar document sets, or completely different queries may be looking for the same answer [2]. In these common scenarios, caches are crucial because they store these partial results in main memory, thus saving execution time for subsequent queries. In this paper, we implement a cooperative cache that enables all the computers in the system to insert and retrieve data transparently, as with a regular local cache. The cache is managed according to the recent accesses to data in each node of the network. Each node records the dataset that is most frequently accessed locally and disseminates a summary of this dataset to the rest of the nodes. This information is updated dynamically and is used by all the nodes in the network to decide which is the best node on which to place a document, and to control the number of replicas of a document in the distributed system.

Moreover, we implement a load balancing algorithm that is cache aware. The objective of a load balancing algorithm is to assign the workload following a policy that optimizes the overall system performance. Our load balancer considers not only the current load on each node but also the expected real execution time of the query, given the state of the global cache. Thus, a query may be assigned not to the most idle node but to a node that has the query's partial results cached, which in the end yields a faster query execution.
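To make the dispatch policy concrete, the following is a minimal sketch of a cache-aware load balancer, not the paper's actual estimator: the cost constants (`hit_time`, `miss_time`, `queue_time`) and the representation of each node's cache as a plain set of document identifiers are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class NodeState:
    """Snapshot of a node as seen by the load balancer."""
    load: int = 0                                   # queries currently queued
    cached_docs: set = field(default_factory=set)   # summary of locally cached documents

def estimated_cost(query_docs, node, hit_time=1.0, miss_time=10.0, queue_time=2.0):
    """Expected execution time of a query on `node`: cached documents are
    cheap, misses pay the full retrieval cost, and every query already
    queued ahead of us adds waiting time."""
    hits = len(query_docs & node.cached_docs)
    misses = len(query_docs) - hits
    return node.load * queue_time + hits * hit_time + misses * miss_time

def dispatch(query_docs, nodes):
    """Pick the node with the lowest expected cost; this may not be the
    most idle node if another node already caches the query's data."""
    return min(nodes, key=lambda name: estimated_cost(query_docs, nodes[name]))

# Example: node "a" is idle but cold, node "b" is busy but holds the data.
nodes = {
    "a": NodeState(load=0),
    "b": NodeState(load=2, cached_docs={"d1", "d2", "d3"}),
}
dispatch({"d1", "d2", "d3"}, nodes)  # → "b"
```

With these illustrative constants, the idle node "a" costs 3 misses × 10.0 = 30.0, while the busy node "b" costs 2 × 2.0 queueing plus 3 hits × 1.0 = 7.0, so the balancer prefers the node with warm partial results.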
Load balancing and cooperative caching are two useful techniques for improving the throughput of a system. However, to our knowledge there is no previous work that studies the interaction of cache-aware load balancing algorithms with cooperative caching algorithms. In this paper, we combine these two techniques and analyze the interaction between them. Both techniques implement different facets of a global management scheme that improves data locality for the executed queries. On the one hand, cooperative caching sends the information to where the queries are currently being processed. On the other hand, the load balancer applies the alternative policy: it sends the queries to the nodes that currently store the data. We study with statistical tools whether either of the two approaches alone is sufficient or if they can be successfully combined. Furthermore, we use a similar approach to understand the system configuration

2009 11th IEEE International Conference on High Performance Computing and Communications, 978-0-7695-3738-2/09 © 2009 IEEE, DOI 10.1109/HPCC.2009.31
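The "send the data to the query" side of this comparison can be sketched as a toy cooperative cache in which a local miss first probes the peers' advertised summaries before falling back to disk. This is a simplified illustration under stated assumptions, not the paper's implementation: summaries are plain key sets rather than compact digests, and eviction is a naive oldest-entry policy.

```python
class CooperativeCache:
    """Toy cooperative cache: on a local miss, a node tries its peers'
    memories (shipping data to where the query runs) before going to
    disk. Each node publishes a summary of its cached keys so that
    peers can locate replicas."""

    def __init__(self, name, capacity=4):
        self.name = name
        self.capacity = capacity
        self.store = {}      # key -> value, insertion-ordered (oldest evicted first)
        self.peers = []      # other CooperativeCache instances
        self.summaries = {}  # peer name -> set of keys that peer advertises

    def publish_summary(self):
        """Disseminate this node's cached key set to every peer."""
        for p in self.peers:
            p.summaries[self.name] = set(self.store)

    def get(self, key, fetch_from_disk):
        """Return (value, origin) where origin is 'local', 'peer', or 'disk'."""
        if key in self.store:
            return self.store[key], "local"
        for p in self.peers:
            if key in self.summaries.get(p.name, ()) and key in p.store:
                self._insert(key, p.store[key])  # copy the replica locally
                return self.store[key], "peer"
        value = fetch_from_disk(key)
        self._insert(key, value)
        return value, "disk"

    def _insert(self, key, value):
        if len(self.store) >= self.capacity:
            self.store.pop(next(iter(self.store)))  # evict the oldest entry
        self.store[key] = value

# Example: node n2 holds a cached answer; n1 serves it from the peer's
# memory instead of recomputing it from disk.
n1, n2 = CooperativeCache("n1"), CooperativeCache("n2")
n1.peers, n2.peers = [n2], [n1]
n2.store["d1"] = "Paris"
n2.publish_summary()
n1.get("d1", lambda k: "from-disk")  # → ("Paris", "peer")
```

Contrast this with the load balancer's policy: here the data moves toward the running query, whereas cache-aware dispatch moves the query toward the resident data; the paper studies whether either alone suffices or whether combining both is best.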