A Framework for the Management of Distributed Systems Based on SNMP

George Oikonomou, Theodore Apostolopoulos
Athens University of Economics and Business
76, Patission str., 104 34 Athens, Greece

Abstract—The traditional task of managing and monitoring a network has never been a trivial one. With recent changes in computing and networking, the area of distributed systems management faces new challenges and increasing complexity. Research in the field reveals that, while many research and commercial solutions are available, some of them are based on proprietary standards. Others focus on monitoring and lack the ability to actively make modifications and perform fine-tuning. Still others have a narrow target group. This paper proposes a framework for the management of distributed applications. The managed hosts are treated as integral parts of the deployment and not as stand-alone, isolated entities. The framework is based on SNMP and is not limited to monitoring. On the contrary, it is capable of carrying out SNMP-SET commands, actively modifying the run-time parameters of the managed application. Finally, it can perform the management of a variety of distributed systems, ranging from small clusters to larger scale deployments such as computational or data grids.

Index terms— Distributed applications management, SNMP

I. INTRODUCTION

The traditional task of managing and monitoring a network has never been a trivial one. Efforts have been made in that direction: a series of protocols has been designed and standards have been developed in order to facilitate the process. However, during the past decades the landscape of computing and networking has undergone revolutionary changes. From the era of single, centralised systems we have moved to an era of highly decentralised, interconnected nodes that share resources in order to provide services transparently to the end user. Legacy management systems, by contrast, targeted single nodes. The current paradigm presents new challenges and increases complexity in the area of network and systems management. There is a need for new solutions that take a holistic approach, viewing a distributed deployment as a whole instead of as a set of isolated hosts.

A wide variety of products and standardisation groups offer proposals that address the problem. Deeper study of these approaches reveals that some of them are based on non-open or proprietary standards. Others focus on monitoring and lack the ability to actively make modifications and perform fine-tuning. Finally, some of them have a narrow target group.

This paper proposes a framework for the management of distributed applications. The hosts participating in the deployment are not treated as stand-alone entities, isolated from the rest of the system. On the contrary, they are treated as integral parts of a larger system. The framework is modular and extensible. Users can build modules, customise it according to their needs and integrate it with their own deployments. Its core is based on the Simple Network Management Protocol (SNMP) [1] and is not limited to monitoring. On the contrary, it is capable of carrying out SNMP-SET commands, actively modifying the run-time parameters of the managed application. Finally, it can perform the management of a variety of distributed systems, ranging from small clusters to larger scale deployments such as computational or data grids.

In the context of this paper, the term “management” is used to refer to all five categories of management defined in the FCAPS model (Fault, Configuration, Accounting, Performance, Security) [1]. Furthermore, the term “distributed” is used to describe a system, application or service that is hosted on multiple nodes interconnected over a network. This therefore includes deployments such as distributed file systems, computer clusters, peer-to-peer networks and computational grids. However, multiprocessor, multi-core, parallel computing and similar systems are considered out of the scope of our work, even though they are very often referred to as “distributed”.

Section II of this text briefly presents related research initiatives in the field of distributed systems management. The proposed framework’s architecture is described in Section III. Section IV outlines a simple deployment and a scenario in which the framework could be used. The paper concludes with Section V, which discusses the framework’s advantages compared to currently available solutions.

II. RELATED WORK

Thorough research in the field of distributed systems management reveals that there are many research and commercial solutions available. It also reveals a host of new requirements that were not previously applicable in the case of traditional network management. Furthermore, contemporary applications have elevated demands in terms of availability and response time. Therefore, there is a need for mechanisms to manage and monitor deployments. Fault detection,