REVIEW PAPERS

Survey of combined hardware–software reliability prediction approaches from architectural and system failure viewpoint

Sourav Sinha 1, Neeraj Kumar Goyal 1, Rajib Mall 2

Received: 13 September 2017 / Revised: 24 April 2019

© The Society for Reliability Engineering, Quality and Operations Management (SREQOM), India and The Division of Operation and Maintenance, Luleå University of Technology, Sweden 2019

Abstract Apart from hardware-specific and software-specific failures, failures arising from hardware–software interaction cause notorious system failures. Researchers have reported two types of interaction failure in a system: hardware-driven software failure and software-driven hardware failure. An efficient reliability prediction approach must consider all types of interaction. We critically analyse the existing reliability prediction models for combined hardware–software systems. We also propose a comparison framework to evaluate these models. The results of our study suggest that none of the considered approaches completely satisfies the characteristics of a good reliability prediction model. Existing approaches rarely consider all types of hardware–software interaction. They also fail to consider the reliability aspects of distributed systems, in which a system interacts with external devices. Our proposed comparison framework can be used as a benchmark for constructing an efficient reliability prediction model for combined hardware–software systems.

Keywords Reliability prediction · Combined hardware–software system · Interaction failure · Architecture based reliability analysis · NIMSAD

1 Introduction

A good system reliability evaluation approach should consider hardware–software (HW–SW) interaction failures in addition to hardware-specific and software-specific failures.
For instance, degradation of hardware components due to electrical stress, temperature, fatigue, configuration changes or design susceptibilities may impact software operation. An experiment at Stanford University demonstrated that nearly 35% of software errors on an MVS/SP operating system are hardware-related (Iyer and Velardi 1985). This implies that faults in hardware components may also cause the associated software to malfunction. Conversely, bugs in the software may drive the associated hardware parts to failure. For example, a Swedish JAS 39 Gripen fighter aircraft crashed in 1993 due to a bug in the flight-control software. Propagation of a fault from hardware to software, or from software to hardware, can therefore cause the combined HW–SW system to fail.

In recent times, the use of software-intensive hardware systems in safety-critical appliances has increased significantly. For example, software functionality in defence aircraft rose sharply from 8% for the F-4 in the 1960s to 80% for the F-22 in 2000 (Tumer and Smidts 2011). We must therefore adopt a reliability evaluation technique suited to software-intensive systems.

Several surveys on the reliability evaluation of independent hardware components and independent software components of a system are available in the literature (Farr 1983, 1996; Immonen and Niemelä 2008; Shanthikumar 1983; Yamada and Osaki 1983). However, subsequent research in this domain has found that a system may fail not only because of hardware-specific or software-specific failures. Failures attributable to improper

Sourav Sinha (sourav.sinha@iitkgp.ac.in)
1 Subir Chowdhury School of Quality and Reliability, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
2 Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India

Int J Syst Assur Eng Manag https://doi.org/10.1007/s13198-019-00811-y