Narciso et al. Test Case Selection Using CBIR and Clustering. Proceedings of the Nineteenth Americas Conference on Information Systems, Chicago, Illinois, August 15-17, 2013.

Test Case Selection Using CBIR and Clustering

Everton Note Narciso
School of Arts, Sciences and Humanities - EACH
University of São Paulo - Brazil
evernarciso@usp.br

Márcio Eduardo Delamaro
Institute of Mathematics and Computer Sciences
University of São Paulo - Brazil
delamaro@icmc.usp.br

Fátima de Lourdes dos Santos Nunes
School of Arts, Sciences and Humanities - EACH
University of São Paulo - Brazil
fatima.nunes@usp.br

ABSTRACT

Selecting test cases is crucial to optimizing the testing of information systems, because it helps to eliminate unnecessary and redundant test data. However, test case selection for systems that address complex domains (e.g. images) is still underexplored. This paper presents a new approach that uses Content-Based Image Retrieval (CBIR), similarity functions and clustering techniques to select test cases from an image-based test suite. Two experiments performed on an image processing system show that our approach, when compared with random testing, can significantly improve the performance of test execution by reducing the number of test cases required to find a fault. The results also show the potential of CBIR for information abstraction, as well as the effectiveness of similarity functions and clustering for test case selection.

Keywords

Test case selection, content-based image retrieval, clustering, similarity functions.

INTRODUCTION

The methods used during the software testing phase directly affect the final quality of the product. Manual and unplanned testing usually yields software of doubtful reliability that may not meet the desired requirements. Automated testing is a process that seeks to minimize the subjectivity of manual testing and to optimize the use of available resources.

Creating mechanisms that exploit software requirements with the least possible computational effort is a major challenge, and in this scenario tools for test case selection are vital in determining test strategies. The main objective of test case selection is to eliminate redundant or unnecessary test data. There are many approaches to selecting test cases for object-oriented systems, embedded systems and systems with alphanumeric inputs/outputs in general (Engstrom, Skoglund and Runeson 2008; Yoo and Harman 2012). However, there is a gap regarding test case selection for graphic domain systems, such as image processing systems.

Testing a system with a large number of images incurs high computational costs. Given the impracticality of this alternative, we present a new approach for selecting test cases. Our approach focuses on image processing systems and consists of applying Content-Based Image Retrieval (CBIR), similarity functions and clustering techniques to select the more relevant test cases (images) based on their characteristics.

The paper is organized as follows: Section 2 presents the basic concepts of test case selection, image processing, CBIR, similarity functions and clustering, which are fundamental to the proposal. Section 3 presents the methodology of the experiments. Section 4 shows the results. Section 5 presents the discussion and threats to validity. Section 6 presents related work and Section 7 concludes the paper.
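
To make the general idea of combining feature extraction, a similarity function and clustering more concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes that each image is given as a non-empty list of grayscale intensities in the range 0-255, uses a normalized intensity histogram as a stand-in for a CBIR feature extractor, Euclidean distance as the similarity function, and a plain k-means as the clustering technique. All function names and parameters (`histogram`, `kmeans`, `select_test_cases`, `bins`, `k`, `seed`) are hypothetical and chosen only for illustration.

```python
import math
import random


def histogram(pixels, bins=4, max_value=256):
    """Normalized intensity histogram (assumed stand-in for a CBIR extractor).

    Assumes `pixels` is a non-empty list of ints in [0, max_value).
    """
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def euclidean(a, b):
    """Euclidean distance between two feature vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(vectors, centroids):
    """Label each vector with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: euclidean(v, centroids[c]))
        for v in vectors
    ]


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means; initial centroids are k vectors sampled at random."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        labels = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(vectors, centroids), centroids


def select_test_cases(images, k, seed=0):
    """Return the sorted indices of one representative image per cluster."""
    features = [histogram(img) for img in images]
    labels, centroids = kmeans(features, k, seed=seed)
    selected = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            # The representative is the member closest to the cluster centroid.
            selected.append(
                min(members, key=lambda i: euclidean(features[i], centroids[c]))
            )
    return sorted(selected)
```

Calling `select_test_cases(images, k)` on a test suite returns at most `k` image indices, one per non-empty cluster, so that similar images are represented by a single test case. Picking the member nearest to the centroid is only one possible selection rule; other rules, such as picking a random member per cluster, would fit the same structure.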