Assembling Multiple-Case Studies: Potential, Principles and Practical Considerations

Aiko Yamashita
Mesan AS & Simula Research Laboratory
Oslo, Norway
aiko@simula.no

Leon Moonen
Certus Centre, Simula Research Laboratory
Oslo, Norway
leon.moonen@computer.org

ABSTRACT

Case studies are a research method aimed at holistically analyzing a phenomenon in its context. Although they cannot be used to answer the same precise research questions that, e.g., controlled experiments can address, case studies cope much better with situations that have several variables of interest, multiple sources of evidence, or rich contexts that cannot be controlled or isolated. As such, case studies are a promising instrument for studying the complex phenomena at play in Software Engineering.

However, the use of case studies as a research methodology entails certain challenges. We argue that one of the biggest challenges is case selection bias when conducting multiple-case studies. In practice, cases are frequently selected based on their availability, without appropriate control over moderator factors. This hinders comparability across cases, leading to internal validity issues.

In this paper, we discuss the notion of assembling cases as a plausible alternative to selecting cases, in order to overcome the selection bias problem when conducting multiple-case studies. In addition, we present and discuss our experiences from applying this approach in a study designed to investigate the impact of software design on maintainability.

Categories and Subject Descriptors
D.2 [Software]: Software Engineering

General Terms
theory, design, experimentation.

Keywords
empirical studies; methodology; case study; internal validity

1.
INTRODUCTION

A case study is an empirical inquiry or research strategy that “investigates a contemporary instance or phenomenon within its real-life context, particularly when boundaries between instance or phenomenon and context are not clear” [24]. In Software Engineering (SE) research, the popularity of case studies is growing, although their practice is not as mature as in other disciplines (e.g., Social research and Information Systems research). Runeson et al. argue that case study methodology is adequate for many types of SE research because the objects of study are contemporary phenomena that are hard to study in isolation [16].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
EASE ’14, May 13 - 14 2014, London, England, BC, United Kingdom
Copyright 2014 ACM 978-1-4503-2476-2/14/05 ...$15.00.
http://dx.doi.org/10.1145/2601248.2601286

Case study research can be based on both quantitative and qualitative evidence, and it encompasses a wide set of systematic techniques (i.e., data collection, analysis, and reporting of the results) [24]. Case studies can provide deeper insight into key aspects that can be investigated to develop or confirm theories that explain an observed phenomenon [9].

However, despite the versatility and advantages of this methodology, its usage also entails some challenges. We believe that one important challenge is selection bias [11] during the selection of cases.
In multiple-case studies, the units of analysis normally need to be selected so that they vary in the properties that the study intends to compare. In practice, cases are often selected based on their availability, without exercising appropriate control over moderator factors (often of a contextual nature). This, in turn, can affect the internal validity of the findings, e.g., because the presence of dissimilar contextual factors makes it more difficult to validate results across the multiple cases.

In this paper, we discuss the notion of assembling cases as a plausible alternative to selecting cases, to achieve better control over contextual factors across cases and tackle the selection bias challenge. We believe that constructing real-life situations (cases), with some level of control over certain variables and/or contextual factors, is a feasible and purposeful approach to conducting empirical studies in SE. This approach enables the study of phenomena that are closer to real life than what is normally attainable in experimental settings (and thus richer in detail). At the same time, it adds elements of control that can facilitate comparison across cases, tackling the problem of case selection bias and reducing threats to internal validity. Furthermore, we illustrate how this notion was applied in a multiple-case study, where we attempted to control for moderator factors and conducted case replication [24]. We report lessons and practical considerations stemming from our experiences with this case study, and discuss the implications, limitations, and adequacy of this approach for SE research.

The remainder of this paper is structured as follows: Section 2 briefly discusses the advantages and limitations of case study research, inspired by the discussion by George and Bennett [11]. Section 3 introduces the notion of assembling case studies. Section 4 illustrates this approach by de-