Neural Comput & Applic (1995) 3:101-112
© 1995 Springer-Verlag London Limited

Discrete Optimisation Based on the Combined Use of Reinforcement and Constraint Satisfaction Schemes

A. Likas, D. Kontoravdis and A. Stafylopatis
Department of Electrical and Computer Engineering, National Technical University of Athens, Computer Science Division, Athens, Greece

A new approach is presented for finding near-optimal solutions to discrete optimisation problems, based on the cooperation of two modules: an optimisation module and a constraint satisfaction module. The optimisation module must be able to search the problem state space through an iterative process of sampling and evaluating the generated samples. To evaluate a generated point, the constraint satisfaction module is first employed to map that point to another one satisfying the problem constraints, and the cost of the new point is then used as the evaluation of the original one. The scheme that we have adopted for testing the effectiveness of the method uses a reinforcement learning algorithm in the optimisation module and a general deterministic constraint satisfaction algorithm in the constraint satisfaction module. Experiments using this scheme for the solution of two optimisation problems indicate that the proposed approach is very effective in providing feasible solutions of acceptable quality.

Keywords: Constraint satisfaction; Discrete optimisation; Graph partitioning; Higher-order Hopfield; Reinforcement learning; Set partitioning

Received for publication 9 August 1994.
Correspondence and offprint requests to: A. Likas, Department of Electrical and Computer Engineering, National Technical University of Athens, Computer Science Division, 157 73 Zographou, Athens, Greece.

1. Introduction and Motivation

Discrete optimisation problems in their general formulation can be defined in terms of a tuple
(S, S', f_c), where S denotes the state space of the problem, S' ⊆ S denotes the set of feasible states, i.e. those satisfying the problem constraints, and f_c : S → ℝ denotes the function that determines the cost of each state. The problem is to find a feasible state for which the cost function is optimal, i.e. to find a state s' ∈ S' such that f_c(s') is optimal in S' [1].

The conventional approach to tackle these problems is to regard the given optimisation problem as a tuple (S, S, f'), where the function f' has the form f' = f_p + f_c, with the penalty function f_p encoding the constraints of the problem and taking its optimum value in the case of feasible states. In this way, the problem is reduced to an unconstrained optimisation one consisting of finding a state s ∈ S for which the function f' is optimal in S. Most search techniques applied to the solution of discrete optimisation problems are based on the above formulation, which expresses the constraints through the penalty term f_p. In that case, the optimisation procedure can be described as an iteration of a simple generate-test loop, as shown in Fig. 1. This approach is unavoidable when no a priori information about the function f' is available. Thus, points are generated which belong to the domain of f' and the evaluation of these points

Fig. 1. The generate-test loop (Generate → Test → Feedback).
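The penalty formulation and its generate-test loop can be illustrated with a minimal sketch. The instance below is hypothetical (it is not one of the problems studied in the paper): a binary vector x of length n must contain exactly k ones (the constraint), and the cost f_c is a linear function of the selected entries. The constraint is folded into a penalty term f_p, and a plain random sampler plays the role of the generator; the reinforcement learning scheme the paper adopts later is a far more directed generator than this.

```python
import random

def f_c(x, c):
    """Cost function: linear cost of the selected entries."""
    return sum(ci for ci, xi in zip(c, x) if xi)

def f_p(x, k, weight=10.0):
    """Penalty term: zero exactly on feasible states (sum(x) == k)."""
    return weight * abs(sum(x) - k)

def generate_test(n, k, c, iterations=5000, seed=0):
    """Generate-test loop minimising f'(x) = f_p(x) + f_c(x)."""
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(iterations):
        x = [rng.randint(0, 1) for _ in range(n)]  # generate a point in S
        val = f_p(x, k) + f_c(x, c)                # test: evaluate f'
        if val < best_val:                         # feedback: keep the best
            best_x, best_val = x, val
    return best_x, best_val
```

Provided the penalty weight exceeds the largest cost difference between states, every infeasible state evaluates worse than the feasible optimum, so minimising f' over all of S recovers a feasible solution; choosing such a weight is the practical difficulty of the penalty approach that motivates the separate constraint satisfaction module.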