Large-Scale QoS-Aware Service-Oriented Networking with a Clustering-Based Approach

Jingwen Jin
Communications Technology Lab, Intel Corporation
Email: jingwen.jin@intel.com

Jin Liang, Jingyi Jin, Klara Nahrstedt
Dept. of Computer Science, University of Illinois at Urbana-Champaign
Email: {jinliang, jingjin, klara}@uiuc.edu

Abstract: Motivated by the fact that most existing QoS service composition solutions have limited scalability, we develop a hierarchical solution framework that achieves scalability by means of topology abstraction and routing state aggregation. The paper presents and solves several unique challenges associated with hierarchical QoS service composition in overlay networks, including topology formation (cluster detection and dynamic reclustering), QoS and service state aggregation and distribution, and QoS service path computation in a hierarchically structured network topology. In our framework, we (1) cluster network nodes based on their Internet distances and maintain clustering optimality at low cost by means of local reclustering operations when dealing with dynamic membership; (2) use data clustering and Bloom filter techniques to jointly reduce the complexity of data representation associated with services within a cluster; and (3) investigate a top-down approach for computing QoS service paths in a hierarchical topology.

keywords: QoS service composition, hierarchical networking, network clustering, topology/data aggregation

I. INTRODUCTION

Recent years have witnessed a software engineering paradigm shift from the object-based component software development style to the service-oriented component software development style, which achieves looser coupling and thus greater interoperability among heterogeneous software components. Future applications can then be seen as compositions of component services that are possibly widely distributed.
Deployment of component-service-based applications in the wide area spawns a new routing problem, which we call QoS service-added routing. Given a service component network and a set of requirements on component services and path QoS (e.g., total path length, end-to-end available bandwidth) for the composed application, the QoS service-added routing problem is to find service paths that satisfy both functional correctness and the QoS requirements [1].

Component-based services can have numerous applications, one being the provision of value-added, on-the-fly transformation services to applications that require content and protocol transformations according to end-to-end needs. As an example, we discuss in this paper a video streaming application that demands QoS treatment. Let us assume a video server that serves only high-resolution MPEG-format videos. For various reasons, a remote user may want to retrieve a video with additional processing requirements. In addition to requiring the streaming video to meet particular QoS levels, the user may request that (1) the video quality be scaled down; (2) the video content be transcoded into a different format; (3) the human speech be translated into another language; (4) proper captions be added; (5) background music be included; and (6) the video color be adjusted. For scalability, we argue that such content transformation tasks (the component services) need to be carried out in a network rather than at end hosts. We call such a network a service network in this work.

Given this assumption, we need special support from the middleware layer that seamlessly composes services at the required QoS level, so that the distributed services infrastructure stays transparent to the applications.
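To make the QoS service-added routing problem defined above concrete, the following is a minimal sketch for a flat topology, not the hierarchical scheme developed in this paper. It assumes undirected links annotated with a length and an available bandwidth, zero cost for executing a service at a node, and a linear chain of required services. The function name `find_service_path` and the data layout are hypothetical, chosen only for illustration. The technique is an ordinary Dijkstra search over states (node, number of services already applied), after pruning links that cannot meet the bandwidth requirement.

```python
import heapq

def find_service_path(links, hosted, src, dst, required, min_bw):
    """Shortest path from src to dst that applies `required` services in order.

    links:    {(u, v): (length, available_bandwidth)}, treated as undirected
              (an assumption of this sketch).
    hosted:   {node: set of service names available at that node}.
    Returns (total_length, [(node, services_done), ...]) or None if infeasible.
    """
    # Prune links that violate the end-to-end bandwidth requirement.
    adj = {}
    for (u, v), (length, bw) in links.items():
        if bw >= min_bw:
            adj.setdefault(u, []).append((v, length))
            adj.setdefault(v, []).append((u, length))

    inf = float("inf")
    start = (src, 0)
    best = {start: 0}   # best known path length per state
    prev = {}           # predecessor state, for path reconstruction
    heap = [(0, src, 0)]

    while heap:
        dist, node, done = heapq.heappop(heap)
        if dist > best.get((node, done), inf):
            continue  # stale heap entry
        if node == dst and done == len(required):
            # Walk predecessors back to the start state.
            state, path = (node, done), []
            while state != start:
                path.append(state)
                state = prev[state]
            path.append(start)
            path.reverse()
            return dist, path
        # Option 1: apply the next required service here (assumed zero cost).
        if done < len(required) and required[done] in hosted.get(node, ()):
            nxt = (node, done + 1)
            if dist < best.get(nxt, inf):
                best[nxt], prev[nxt] = dist, (node, done)
                heapq.heappush(heap, (dist, node, done + 1))
        # Option 2: move to a neighbor over a link with enough bandwidth.
        for nbr, length in adj.get(node, ()):
            nxt = (nbr, done)
            if dist + length < best.get(nxt, inf):
                best[nxt], prev[nxt] = dist + length, (node, done)
                heapq.heappush(heap, (dist + length, nbr, done))
    return None

# Hypothetical five-node service network for illustration.
links = {("S", "A"): (1, 10), ("A", "B"): (1, 10), ("B", "D"): (1, 10),
         ("S", "C"): (1, 2), ("C", "D"): (1, 2)}
hosted = {"A": {"scale"}, "B": {"transcode"}, "C": {"scale", "transcode"}}
required = ["scale", "transcode"]

low_bw = find_service_path(links, hosted, "S", "D", required, min_bw=1)
high_bw = find_service_path(links, hosted, "S", "D", required, min_bw=5)
too_high = find_service_path(links, hosted, "S", "D", required, min_bw=20)
```

In this toy network, a loose bandwidth requirement lets the search use the short route through node C, which hosts both services; a stricter requirement forces the longer route through A and B; and an unsatisfiable requirement yields no path. The search state grows with both the number of nodes and the number of required services, which hints at why flat, centralized planning becomes costly in large networks.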
While the QoS service-added routing (or composition) problem has been studied quite extensively in recent years, most of the existing solutions perform routing in flat topologies based on centralized planning. Those schemes do not scale because the routing information maintenance overhead grows quickly with the size of the network. This paper aims to provide a scalable solution by devising a clustering-based hierarchical QoS service-added routing framework. By introducing hierarchies into the routing plane, topology abstraction and state information aggregation become viable, so that the routing information maintenance overhead can be significantly reduced.

Hierarchical routing is not a new concept; hierarchical QoS (data) routing in ATM networks can be found in [2], [3], [4]. However, at the overlay layer, QoS service-added routing presents a number of unique challenges that demand special efforts.

1) Identification of clusters at the application layer and management of dynamic membership of network nodes: In a physical network, clusters are defined manually by humans based on properties of the network such as location, administrative domain, or connectivity. However, an overlay network has a virtual, fully connected topology as long as the underlying physical network does not partition. Manually defining clusters and configuring the topology of a large overlay network is infeasible. Therefore, we need a solution that automatically (1) detects clusters that conform to the underlying physical topology, (2) defines proper links among the nodes as well as roles for nodes in a hierarchical topology, and (3) copes with dynamic membership of network nodes, including allowing nodes to join/leave the system and performing reclustering op-