Ensuring Continuous Data Accuracy in AISEMA Systems

Irina Diana Coman, Alberto Sillitti, Giancarlo Succi
Center for Applied Software Engineering
Free University of Bolzano
Bolzano, Italy
{IrinaDiana.Coman, Alberto.Sillitti, Giancarlo.Succi}@unibz.it

Abstract — Automated In-process Software Engineering Measurement and Analysis (AISEMA) systems are powerful tools to monitor and improve the software development process. However, to be useful, such tools must work continuously. Therefore, they need the support of advanced monitoring systems able to detect and locate malfunctions and to automatically inform human operators, providing all the information required to solve the problem. This paper describes the approach and the tools developed to support a specific AISEMA system that assists both managers and developers in implementing continuous process improvement initiatives.

Keywords: AISEMA; development process; monitoring.

I. INTRODUCTION

The success of software measurement programs depends strongly on the automation of the related data collection (Pfleeger, 1993; Daskalantonakis, 1992; Offen and Jeffery, 1997; Hall and Fenton, 1997; Iversen and Mathiassen, 2000). Manual data collection suffers from several limitations: it is time consuming, tedious, error prone, and often biased or delayed (Johnson and Disney, 1999). Semi-automated data collection is an improvement (e.g., tools such as LEAP (Moore, 1999)), but it still causes context-switching problems (Johnson et al., 2003) with a negative impact on the performance of the developers, since it requires them to switch continuously between working activities and data collection. A new generation of tools (such as PROM (Sillitti et al., 2003) and Hackystat (Johnson et al., 2003)) has been developed to overcome these limitations by providing fully automated, non-invasive data collection.
Such tools allow data collected from ongoing projects to be used for the improvement of those same projects; therefore, they are also called Automated In-process Software Engineering Measurement and Analysis (AISEMA) systems. AISEMA systems aim not only at automatically collecting data, but also at providing tailored analyses for decision support. They reduce the cost of data collection, as they run in the background and let people focus on their work without any additional workload or distraction. They can collect a large variety of data and, based on these data, they offer support for process management (Remencius et al., 2009; Danovaro et al., 2008), assessment of low-level processes (Coman and Sillitti, 2009), etc.

Ensuring continuous data accuracy is one of the main challenges in the usage of an AISEMA system (Coman et al., 2009). Changes in the environment (such as software updates, software crashes, hardware failures, changes in security policies, etc.) sometimes affected the accuracy of the data, mainly by disabling some of the data collection components, hindering data transfer, or causing data loss. Not all such events are avoidable; consequently, small amounts of data might be lost from time to time. However, it is very important to limit missing data as much as possible and to have detailed information on why they are missing and on their type. Such information helps to assess whether the missing data invalidate a specific analysis, thus ensuring reliable results of data analyses.

In most cases, existing systems already have error-prevention mechanisms located at each of their components. This is the case of the PROM system (Sillitti et al., 2003). Such mechanisms ensure the correct functioning of the individual components. However, they cannot prevent, for instance, the silent disabling of a component as a result of repeated crashes of the host system.
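The kind of system-level monitoring that can catch such silent failures can be sketched as a watchdog that flags components whose data uploads have stopped, distinguishes intentional opt-outs from failures, applies known remediations, and escalates unknown problems to maintainers. The following is a minimal illustration only; the component names, heartbeat threshold, and remediation table are assumptions for the sketch, not part of the PROM implementation:

```python
import time
from dataclasses import dataclass

@dataclass
class ComponentStatus:
    name: str
    last_heartbeat: float   # timestamp of the last data upload / ping
    opted_out: bool = False # developer chose to disable this collection

class Watchdog:
    """Hypothetical system-level monitor for an AISEMA deployment."""

    HEARTBEAT_TIMEOUT = 3600.0  # seconds of silence before flagging (assumed)

    def __init__(self):
        self.log = []            # recorded occurrences of problems
        self.notifications = []  # problems escalated to maintainers
        # Known causes mapped to automated fixes (assumed example).
        self.known_fixes = {"plugin_disabled": self.restart_plugin}

    def restart_plugin(self, comp):
        # Automated remediation for a known problem.
        self.log.append(f"restarted {comp.name}")
        comp.last_heartbeat = time.time()

    def diagnose(self, comp):
        # Placeholder: in practice, inspect local logs of the component
        # to localize the actual cause of the silence.
        return "plugin_disabled"

    def check(self, components, now=None):
        now = now if now is not None else time.time()
        for comp in components:
            if comp.opted_out:
                continue  # intentional disabling: not a failure
            if now - comp.last_heartbeat > self.HEARTBEAT_TIMEOUT:
                cause = self.diagnose(comp)
                self.log.append(f"{comp.name}: {cause}")
                fix = self.known_fixes.get(cause)
                if fix:
                    fix(comp)  # known solution: solve automatically
                else:
                    self.notifications.append((comp.name, cause))
```

The key design point is that the watchdog uses context information (the opt-out flag) to avoid treating an intentionally disabled component as a failure, and only escalates to humans when no automated fix is known.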
Moreover, in some cases, the disabling of a component is perfectly acceptable (for instance, when a developer chooses not to collect some specific data). Thus, to assess whether the disabled status of a component represents a failure of the system, additional context information is needed.

In PROM, as the components interact with the server over the network, the correct functioning of the system as a whole does not depend only on the correct functioning of each individual component. Because the client components are not aware of their broader context (and are not meant to be), there is a need for a separate component that monitors the functioning of the system as a whole. Such a component should identify potential problems and use local information from specific components to localize the actual cause of the problem. Additionally, it should log the occurrence of the problem and, if the solution is known, proceed with solving it. If the solution is unknown, it should notify the maintainers of the system. The initial solution, used during most of the case study, was to have a human perform such monitoring. However, this is