Distributed CTL model checking

M. Bourahla

Abstract: As model checking becomes increasingly used in industry, there is a strong need for efficient new methods to deal with large, real-size designs. The author presents a novel method for improving the performance of model checking using parallelisation techniques. Model checking is performed in a distributed-memory environment consisting of a network of machines. The two key concerns are memory balance and communication reduction. A new algorithm is proposed for partitioning the large state spaces that model industrial designs with hundreds of millions of states and transitions. The state space is assumed to be represented by a weighted Kripke structure (an extension of the Kripke structure in which weights are associated with the states and with the transitions). The algorithm partitions the weighted Kripke structure by performing a combination of abstraction, partition and refinement on this structure. The CTL model checking algorithm is distributed over processes located on different network machines. Each process owns a partition and executes the algorithm on it. The algorithm for CTL model checking is designed to reduce the communication overhead between the processes. Experimental results on large real designs show that this method improves the quality of the partitions, reduces the communication overhead and hence improves the overall performance of model checking.

1 Introduction

As formal verification becomes increasingly used in industry as a part of design processes, there is a constant need for efficient tool support to deal with real-size applications. Many methods have been proposed to cope with the size of such applications, including abstraction, partial order reduction, equivalence-based reduction, modular methods, and symmetry [1–3]. Recently, a promising new method to tackle the state space explosion problem was introduced [4–10].
This method is based on the use of multiprocessor systems or workstation clusters. These systems often offer a very large (distributed) main memory. Furthermore, their large computational power also helps to reduce model checking time effectively.

Model checking works by constructing a model (state space) of the system under design from a high-level description of the system; the desired correctness properties are then verified on this model. In this paper, we develop a parallel model checking method that is efficient in terms of both computation and communication. The following is a detailed description of our approach.

The state space on which model checking will be performed is partitioned into M parts, where each part is owned by one process in the network. To increase the performance of parallel model checking, it is essential to achieve good load balancing between the M machines, meaning that the M parts of the distributed state space should contain nearly the same number of states. The quality of a partitioning algorithm can also be estimated by the number of cross-border transitions of the partitioned state space (i.e. transitions whose source state is in one component and whose target state is in another). This number should be as small as possible, since it affects the number of messages sent over the network during the analysis. We adopted a static partitioning scheme, which avoids the potential communication overhead of dynamic load balancing schemes. This partitioning scheme has an adaptive cost which yields nearly equal partitions with a small number of cross-border transitions. The problem is then to choose an appropriate partitioning algorithm that associates a machine index with each state. The result of this algorithm is a partitioning function P. Our algorithm for partitioning has three steps.
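The two quality criteria described above, load balance and the number of cross-border transitions, can be illustrated with a minimal sketch. It is not the author's implementation: the function name `partition_quality`, the six-state example and the use of plain (unweighted) counts are assumptions made only for illustration, and the partitioning function P is modelled as a Python dictionary from states to machine indices.

```python
from collections import Counter


def partition_quality(states, transitions, part, m):
    """Evaluate a partitioning function `part` (state -> machine index).

    Returns (sizes, imbalance, cut) where
      sizes     : number of states owned by each of the m machines,
      imbalance : largest part size divided by the ideal size len(states) / m
                  (1.0 means perfectly balanced),
      cut       : number of cross-border transitions, i.e. transitions whose
                  source and target states are owned by different machines.
    Plain counts are used here; this is an illustrative assumption.
    """
    counts = Counter(part[s] for s in states)
    sizes = [counts.get(i, 0) for i in range(m)]
    ideal = len(states) / m
    imbalance = max(sizes) / ideal if states else 0.0
    cut = sum(1 for (src, dst) in transitions if part[src] != part[dst])
    return sizes, imbalance, cut


# Hypothetical state space: two 3-cycles joined by a single transition.
states = ["s0", "s1", "s2", "s3", "s4", "s5"]
transitions = [
    ("s0", "s1"), ("s1", "s2"), ("s2", "s0"),  # first cycle
    ("s2", "s3"),                              # bridge between the cycles
    ("s3", "s4"), ("s4", "s5"), ("s5", "s3"),  # second cycle
]

# Partition that keeps each cycle on one machine.
by_cycle = {"s0": 0, "s1": 0, "s2": 0, "s3": 1, "s4": 1, "s5": 1}
# Partition that alternates machines from state to state.
alternating = {"s0": 0, "s1": 1, "s2": 0, "s3": 1, "s4": 0, "s5": 1}

balanced = partition_quality(states, transitions, by_cycle, 2)
scattered = partition_quality(states, transitions, alternating, 2)
print(balanced)   # ([3, 3], 1.0, 1)
print(scattered)  # ([3, 3], 1.0, 5)
```

Both partitions are equally balanced, but the second cuts five transitions instead of one, so it would be expected to generate more network messages during the analysis.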
The first step of the partitioning algorithm is the abstraction of the state space, represented by a weighted Kripke structure, using the notion of matching [11–13]: pairs of states joined by a transition in the model (one the source, the other the target) are matched. This abstraction continues until the state space is reduced to a number of states small enough for partitioning to be easy. After this much smaller weighted Kripke structure has been partitioned into M parts, the partitioning is projected back towards the original (finer) weighted Kripke structure, with refinements performed periodically on the projected structures. Once the partitioning function $P : S \rightarrow \{0, \ldots, M-1\}$ has been produced, we proceed by cutting the transitions that cross the border, which produces M fragments in which the border states are duplicated in a manner that satisfies the balancing condition.

© IEE, 2005
IEE Proceedings online no. 20050001
doi:10.1049/ip-sen:20050001
Paper first received 6th January and in revised form 12th April 2005
The author is with the Computer Science Department, University of Biskra, BP 145 RP, Biskra, Algeria, 07000
E-mail: mbourahla@hotmail.com
IEE Proc.-Softw., Vol. 152, No. 6, December 2005

To parallelise model checking, we start a process on each of the M machines in the network. Each fragment is assigned to a process. This reduces both the amount of memory needed on each machine and the overall execution time. The processes perform a standard model checking algorithm on their own components. However, this model checking algorithm can discover states that do not belong to the fragment the process owns. These states are sent to the process that owns them. To reduce the overhead of message transmission and to increase the overlapping between communications and