DSMS Scheduling Regarding Complex QoS Metrics

Mohammad Ghalambor
Department of Computer Engineering, Iran University of Science and Technology
Tehran, Iran
mghalambor@iust.ac.ir

Ali A. Safaeei
Department of Computer Engineering, Iran University of Science and Technology
Tehran, Iran
safaiee@iust.ac.ir

Mohammad Abdollahi Azgomi
Department of Computer Engineering, Iran University of Science and Technology
Tehran, Iran
azgomi@iust.ac.ir

Abstract: In Data Stream Management Systems (DSMSs), data do not appear in the form of persistent relations but rather arrive in multiple, continuous, rapid, time-varying streams. Achieving good performance in these systems is still the main challenge, and minimizing run-time memory usage and response time are the most important performance issues. Choosing a better scheduling algorithm yields better performance. The variety of schedulers and of customized Quality of Service (QoS) metrics (which reflect the degree of user satisfaction) motivated us to find an approach for choosing the best scheduler for each case. In this paper, a new static periodic scheduler called Meta-Scheduler is proposed. We concentrate on non-real-time DSMSs that deal with semi-regular (not too bursty) streams and use complex QoS metrics. We have used coloured Petri net models to choose the best scheduling algorithm for each period with regard to complex QoS metrics and varying system statistics. We have studied our scheduler in the context of our new DSMS prototype. Finally, we show how Meta-Scheduler outperforms simple schedulers when a user defines a complex QoS metric.

I. INTRODUCTION

Some modern applications need to process data streams that arrive in a continuous, unbounded manner. Examples of such applications include financial applications, network monitoring, security, telecommunications data management, web applications, manufacturing, and sensor networks [1].
A data stream is a real-time, continuous, rapid, possibly unpredictable, infinite, time-varying, and ordered (implicitly by arrival time or explicitly by timestamp) sequence of tuples [2]. Applications that process such streams have new data management requirements that arise from the nature of data streams, and conventional DBMSs are unable to fulfill these requirements. A new class of systems that satisfies the requirements of stream-based applications is called a data stream management system (DSMS).

In DSMSs, queries are usually continuous and predefined. These queries are converted to query plans. A query plan is made of many operators that are connected to each other by intermediate queues. In addition, synopses are used to store sketches for some kinds of operators (e.g., binary join operators). Incoming tuples enter at the leaves of a query plan and travel through it. Finally, the resulting tuples are generated at the end of the query plan (i.e., the root).

Operator scheduling in data stream query processing is the problem of deciding how to assign the processor to these many operators. Different scheduling methods result in different performance (e.g., in response time and memory usage). Choosing a proper scheduler is not easy, and there is no obviously preferred metric. Each scheduler has some well-known attributes (e.g., FIFO is the best for obtaining the least response time, and greedy is suitable for minimum memory usage) [3]. In this paper, we show that these attributes are not general and that the best scheduler for a system should be chosen considering the user-defined QoS metrics, the query plan, and some statistics (e.g., the rate of each input stream).

The remainder of this paper is organized as follows. We start with a description of complex QoS metrics in Section 2. In Section 3, we present the new scheduling model. Next, in Section 4, we show the results of experimental evaluations. Section 5 gives a brief background of scheduling in DSMSs and an overview of the related work.
We conclude with remarks on future work in Section 6.

II. COMPLEX QOS

The most important performance metrics for a DSMS include Average Response Time (ART), average slowdown, and memory usage. Although previous research has shown which scheduler is better with regard to simple performance metrics, it has not been possible so far to choose the best scheduler with regard to complex QoS metrics. We briefly define three essential performance metrics for DSMSs:

978-1-4244-3806-8/09/$25.00 © 2009 IEEE