A Comprehensive Analysis on Reusability of GP-Evolved Job Shop Dispatching Rules

Yi Mei*, Mengjie Zhang*
*School of Engineering and Computer Science, Victoria University of Wellington, Wellington, 6012, New Zealand
{yi.mei, mengjie.zhang}@ecs.vuw.ac.nz

Abstract—Genetic Programming (GP) has been extensively used to automatically design dispatching rules for job shop scheduling problems. However, previous studies focus only on the performance on the training instances. So far, there has been no systematic investigation of the reusability of the GP-evolved rules on unseen instances. In practice, it is desirable to train the rules on smaller job shop instances and apply them to larger instances with more jobs and machines, in order to save training time. In this case, the reusability of the GP-evolved rules under different numbers of jobs and machines is an important issue. In this paper, a comprehensive investigation is conducted to analyse how the variation in the numbers of jobs and machines from the training set to the test set affects the reusability of the GP-evolved rules. It is found that, in terms of minimizing makespan, the reusability of the GP-evolved rules depends strongly on the variation in the numbers of jobs and machines. Better reusability can be achieved by choosing training instances whose numbers of jobs and machines (or at least the ratio between the numbers of jobs and machines) are closer to those of the test instances. Furthermore, the ratio between the numbers of jobs and machines is demonstrated to be an important factor reflecting the complexity of an instance for dispatching rules. This study is the first systematic investigation of the reusability of GP-evolved dispatching rules.

I. INTRODUCTION

Job Shop Scheduling (JSS) [1] is a classic scheduling problem with various applications in manufacturing industries and cloud computing.
Due to its significance in practice, JSS has been extensively investigated in the past decades, and a large number of mathematical programming and heuristic algorithms have been proposed for solving it [2], [3].

Dispatching rules have been widely used in real-world JSS problems, due to their simplicity and ease of application. More importantly, dispatching rules scale much better than meta-heuristic methods, and can be applied to any unseen instance without modification. Briefly speaking, a dispatching rule generates a schedule by choosing, according to some criterion, the next operation to be processed by an idle machine. Numerous dispatching rules have been designed manually for various job shop situations and objectives, and have achieved reasonably good performance [4], [5], [6]. In addition, there are some studies on composite rules [7] and on adaptive rule selection [8] from a pool of rules to improve the robustness of the rules in different job shop situations.

Recently, the automatic design of dispatching rules using Genetic Programming (GP) [9] has become prevalent, as evidenced by the increasing number of studies in recent decades [10], [11], [12], [13]. GP formulates the rules as priority functions, which are represented as trees. By searching in the rule space, more hidden knowledge can be discovered by computer programs, and more complex and better rules can be obtained for JSS. It has been demonstrated that the rules generated by GP can perform much better than the existing manually designed rules on both static [12] and dynamic [13] JSS instances.

Traditionally, when evolving dispatching rules with GP, a set of JSS instances is selected (from existing benchmarks or randomly generated) for evaluating the rules. That is, during the optimization process, the rules that generate better schedules for the selected JSS instances are considered to have better fitness values.
In this sense, the optimization process can be seen as a rule training process, in which the selected JSS instances are the training instances. Most existing work has focused on improving the performance of the rules on the training instances (i.e. the training performance). However, the reusability of the rules on unseen instances (i.e. the test performance), especially those with different properties, has not been investigated systematically. Some recent work [14], [12], [13], [15] evaluated the trained rules on unseen instances with different numbers of machines and different distributions, showing the potential of the GP-evolved dispatching rules in terms of reusability and scalability. However, there is still no guideline on how to form the training set to improve the reusability of the rules.

This paper aims to conduct a systematic study to analyse the reusability of the GP-evolved rules on unseen instances. In particular, motivated by the idea of training the rules on smaller instances and applying them to larger unseen instances to improve training efficiency, we focus on the situation in which the test instances have larger numbers of jobs and machines than the training instances. The goal of this study is to discover how the variation in the numbers of jobs and machines from the training set to the test set affects the reusability of the GP-evolved dispatching rules, and to identify the important factors that affect the relatedness of the training set to the test set. To this end, a number of carefully designed experiments compare rules trained on different training sets on the same test set. Based on the knowledge learnt, we expect to propose a new guideline for selecting a more related training set, so that the resultant GP-evolved rules are more reusable on the test set.
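
To make the notions of a tree-represented priority function, schedule construction by a dispatching rule, and training versus test performance concrete, the following Python sketch may be helpful. It is only an illustration under stated assumptions and not the implementation used in this study: the terminals "PT" and "WKR", the non-delay schedule builder, the random instance generator (every job visits every machine once, with integer processing times in 1 to 99), and all function names are hypothetical choices made for the example.

```python
import random

def evaluate(tree, features):
    """Evaluate a priority function stored as a nested-tuple expression tree.

    Terminals "PT" (processing time) and "WKR" (work remaining in the job)
    are illustrative assumptions, not the terminal set of this paper.
    """
    if isinstance(tree, str):
        return features[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    a, b = evaluate(left, features), evaluate(right, features)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")

def makespan(instance, tree):
    """Build a non-delay schedule; the lowest priority value is served first.

    `instance` is a list of jobs, each a list of (machine, processing_time).
    """
    n_jobs = len(instance)
    next_op = [0] * n_jobs      # index of each job's next unscheduled operation
    job_ready = [0] * n_jobs    # time at which each job becomes available
    machine_ready = {}          # time at which each machine becomes idle
    remaining = sum(len(ops) for ops in instance)
    while remaining > 0:
        candidates = []
        for j, ops in enumerate(instance):
            if next_op[j] < len(ops):
                machine, ptime = ops[next_op[j]]
                start = max(job_ready[j], machine_ready.get(machine, 0))
                wkr = sum(p for _, p in ops[next_op[j]:])
                candidates.append((start, j, machine, ptime, wkr))
        # Only operations that can start at the earliest possible time compete.
        t = min(c[0] for c in candidates)
        ready = [c for c in candidates if c[0] == t]
        start, j, machine, ptime, _ = min(
            ready,
            key=lambda c: (evaluate(tree, {"PT": c[3], "WKR": c[4]}), c[1]),
        )
        job_ready[j] = machine_ready[machine] = start + ptime
        next_op[j] += 1
        remaining -= 1
    return max(job_ready)

def random_instance(n_jobs, n_machines, rng):
    """Assumed generator: each job visits every machine once in random order."""
    instance = []
    for _ in range(n_jobs):
        machines = list(range(n_machines))
        rng.shuffle(machines)
        instance.append([(m, rng.randint(1, 99)) for m in machines])
    return instance

def mean_makespan(instances, tree):
    """Fitness of a rule on a set of instances (lower is better)."""
    return sum(makespan(inst, tree) for inst in instances) / len(instances)

if __name__ == "__main__":
    rng = random.Random(0)
    train = [random_instance(10, 5, rng) for _ in range(5)]    # smaller instances
    test = [random_instance(30, 10, rng) for _ in range(5)]    # larger instances
    rules = {"SPT": "PT", "MWKR": ("-", 0, "WKR")}
    for name, tree in rules.items():
        print(name, mean_makespan(train, tree), mean_makespan(test, tree))
```

In this sketch, a GP system would search over trees such as `("-", 0, "WKR")` using `mean_makespan` on the training set as the fitness, and reusability would then be assessed by applying the best tree to the larger test set. The two hand-written rules merely stand in for evolved ones.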