Arab J Sci Eng
DOI 10.1007/s13369-017-2554-7

RESEARCH ARTICLE - COMPUTER ENGINEERING AND COMPUTER SCIENCE

Quantification of Software Code Coverage Using Artificial Bee Colony Optimization Based on Markov Approach

Muthusamy Boopathi 1 · Ramalingam Sujatha 1 · Chandran Senthil Kumar 2 · Srinivasan Narasimman 1

Received: 15 September 2016 / Accepted: 11 April 2017
© King Fahd University of Petroleum & Minerals 2017

Abstract Software test suite generation and the development of techniques to optimize the test suite are vital parts of the software development life cycle. In this paper, a combination of Markov chain and artificial bee colony (ABC) optimization techniques is adopted to attain software code coverage. Initially, a dd-graph is derived from the control flow graph of the source code and is represented as a Markov chain. The number of paths is obtained based on linear code sequence and jump (LCSAJ) coverage. LCSAJ is used to reduce the number of independent paths compared with the paths obtained by basis path testing. Test cases are generated automatically and, based on the operation profile of the test suite, transition probabilities are obtained using the gcov analysis tool. Further, ABC optimization is adopted to ensure software code coverage. The initial population is randomly selected from the test suite and populated for subsequent generations using the ABC algorithm. The test cases are generated for three variables of mixed data types, namely integer, float and Boolean. The quality of the test cases is improved during every iteration of ABC optimization, and the test cases traverse a number of LCSAJ-based independent paths, thereby ensuring software code coverage.
Finally, software code coverage is quantified using the fitness/happiness value computed as a product of node coverage and the corresponding transition probability values based on the path covered.

Keywords Artificial bee colony · Markov chain · Dd-graph · LCSAJ coverage · Initial test suite · Path coverage

✉ Ramalingam Sujatha, sujathar@ssn.edu.in
Muthusamy Boopathi, rithishboopathi@gmail.com
Chandran Senthil Kumar, cskumar@igcar.gov.in
1 Department of Mathematics, SSN College of Engineering, Kalavakkam 603110, India
2 Safety Research Institute, Atomic Energy Regulatory Board, Kalpakkam 603102, India

1 Introduction

Software testing is one of the most important tasks in the software development life cycle for assuring the reliability of software. This task usually consumes at least 50% of the total cost of software development [1, 2]. A globally prevalent problem, the detection of software piracy or theft, is discussed in [3]. The basic aim of software testing is to execute the code and identify the presence of bugs. The probability of identifying bugs depends on the efficiency of the test suites. To minimize cost and time in the software development process, efficient test suites with maximum path and code coverage that expose as many faults as possible are required [1, 4]. In many industries, manual test data generation or preparation is a primary task, with the onus of discovering the correct set of test inputs falling on the tester. Even though manual selection of adequate test data is a lengthy and intensive process, it ensures that all the nuances of the code under test are examined properly [5]. Generally, software testing is challenging, and it becomes especially tedious when the program is large and complex. In 1976, Tom McCabe first introduced cyclomatic complexity as a software metric to quantify the complexity of software [6, 7].
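As an illustration of the metric just mentioned, cyclomatic complexity is commonly computed from a control flow graph as V(G) = E - N + 2P, where E is the number of edges, N the number of nodes and P the number of connected components. The following is a minimal sketch under that standard definition; the function name `cyclomatic_complexity` and the small example graph are hypothetical and are not taken from the paper.

```python
def cyclomatic_complexity(edges, num_components=1):
    """Return McCabe's V(G) = E - N + 2P for a control flow graph.

    `edges` is a list of (source, target) pairs; nodes are inferred
    from the edge endpoints. `num_components` is P (1 for one routine).
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components


# Hypothetical control flow graph: one if/else followed by one loop.
cfg_edges = [
    ("entry", "if"),
    ("if", "then"),
    ("if", "else"),
    ("then", "loop"),
    ("else", "loop"),
    ("loop", "body"),
    ("body", "loop"),
    ("loop", "exit"),
]

# 8 edges, 7 nodes, 1 component: V(G) = 8 - 7 + 2 = 3
print(cyclomatic_complexity(cfg_edges))
```

For this graph the value is 3, which matches the two decision points (the branch and the loop test) plus one.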
The basis path method is a white-box testing technique that helps test case designers derive a logical complexity measure of a procedural design and use this measure as a guide for defining a basis set of execution paths. In basis path testing, the number of independent paths