EGI: Implementing Service Management in a large-scale e-Infrastructure

Sy Holsinger
Strategy and Policy
EGI.eu
Amsterdam, Netherlands
sy.holsinger@egi.eu

Sergio Andreozzi
Strategy and Policy
EGI.eu
Amsterdam, Netherlands
sergio.andreozzi@egi.eu

Abstract— The European Grid Infrastructure (EGI), a distributed computing infrastructure for research, has been continuously evolving from a project-based system into a 24/7 professional service. This transition comes with user expectations of increasingly reliable and predictable services, improvements that can only be achieved through advanced technologies coupled with mature human management processes. Many publicly funded e-Infrastructures face similar challenges. This complex change involves defining new management approaches that can support long-term, reliable service delivery and developing community competences in them. EGI has been working on implementing the FitSM service management standard as a first step toward better service management, more predictable service delivery, and more efficient use of organizational resources. This paper presents the methodology used and the overall experience gained in implementing IT Service Management in a large-scale e-Infrastructure.

Keywords—e-Infrastructure; service management; distributed computing; sustainability

I. INTRODUCTION

The European Grid Infrastructure (EGI) [1], a distributed computing infrastructure for research, has been continuously evolving from a project-based system into a 24/7 professional service. This transition comes with user expectations of increasingly reliable and predictable services, improvements that can only be achieved through advanced technologies coupled with mature human management processes. EGI and other e-Infrastructures face similar challenges in moving to sustainable service provisioning.
This complex process involves defining new management approaches that can support long-term, reliable service delivery and developing community competences in them.

In the public and commercial sectors, IT Service Management techniques have been developed to make the complex process of service provision more repeatable, predictable and controllable. Examples include the international standard ISO/IEC 20000 and the ITIL best practice framework. However, it is harder to find service management approaches that can cope with federated environments such as EGI, which often lack the hierarchy and formal agreements seen in other situations. Existing frameworks and standards can also be too comprehensive or complex to apply realistically within a federated environment, or they do not meet the needs of federated service provision.

As a result, an approach better suited to complex federations is needed to support EGI. This need has led to the creation of the FitSM standard [2], developed by the FedSM project [3] in collaboration with EGI.eu (the coordination body of EGI) and the Polish [4] and Finnish [5] National Grid Infrastructures. FitSM is much lighter than frameworks such as ITIL and is tailored for federated environments. With support from FedSM, EGI.eu is working to document current practices and define them as structured processes based on this new standard, in order to improve service delivery to its customers in the areas of operations, policy and software.

If EGI is to continue evolving towards a sustainable service provider, the management of its services will need to improve continuously as well. Better service management will directly impact sustainability by offering more predictable service delivery, making more efficient use of organizational resources while reducing human error, and providing clarity about the value that researchers receive and funding bodies support.
This paper outlines the motivation for implementing federated IT Service Management (ITSM) in EGI, describes the selected methodology, and reports on the overall experience.

II. BACKGROUND OF EGI

EGI is an open ICT ecosystem defined in terms of the roles and functions required to provide value-added services. These roles and functions comprise:

1) Resource Providers – approximately 350 organizations across Europe providing the ICT resources that allow grid and cloud compute and storage services to be provided to researchers;
2) National Grid Initiatives (NGIs) – such as the Finnish and Polish NGIs, bind multiple national Resource Providers to offer management and technical services that enable the federation on a national level;
3) Technology Providers – provide the technology needed by EGI to integrate communities and deploy the user-oriented services;
4) EGI.eu – the European coordination body, provides coordination services around governance, operations, security, policy and technology, offers marketing and outreach support, and delivers technical services for European and global service provision;
5) Funding agencies – support this collaboration at both national and European levels;
6) Researchers – both large and small research collaborations, either national or European in nature, are the consumers of the compute and storage services.

This work was co-funded by the European Commission through the FedSM project contract (312851) and EGI_InSPIRE (261323).

978-1-4799-0913-1/14/$31.00 ©2014 IEEE