The Impact of Search Heuristics on Heavy-Tailed Behaviour

Tudor Hulubei and Barry O'Sullivan
Cork Constraint Computation Centre
Department of Computer Science, University College Cork, Ireland
tudor@hulubei.net, b.osullivan@cs.ucc.ie

Abstract. The heavy-tailed phenomenon that characterises the runtime distributions of backtrack search procedures has received considerable attention over the past few years. Some have conjectured that heavy-tailed behaviour is largely due to the characteristics of the algorithm used. Others have conjectured that problem structure is a significant contributor. In this paper we explore the former hypothesis, namely we study how variable and value ordering heuristics impact the heavy-tailedness of the runtime distributions of backtrack search procedures. We demonstrate that heavy-tailed behaviour can be eliminated from particular classes of random problems by carefully selecting the search heuristics, even when using chronological backtrack search. We also show that combinations of good search heuristics can eliminate heavy tails from quasigroups with holes of order 10 and 20, and give some insights into why this is the case. These results motivate a more detailed analysis of the effects that variable and value orderings can have on heavy-tailedness. We show how combinations of variable and value ordering heuristics can result in a runtime distribution being inherently heavy-tailed. Specifically, we show that even if we were to use an oracle to refute insoluble subtrees optimally, for some combinations of heuristics we would still observe heavy-tailed behaviour. Finally, we study the distributions of refutation sizes found using different combinations of heuristics and gain some further insights into what characteristics tend to give rise to heavy-tailed behaviour.
1 Introduction

The Italian-born Swiss economist Vilfredo Pareto first introduced the theory of non-standard probability distributions in 1897 in the context of income distribution. These distributions have been used to model many real-world phenomena, from weather forecasting to stock market analysis. More recently, they have been used to model the cost of combinatorial search methods. Exceptionally hard instances have been observed amongst certain classes of constraint satisfaction problems, such as graph colouring [17], SAT [10], random problems [2, 13, 25, 26], and quasigroup completion problems [15]. In studying this phenomenon, researchers have used a wide range of systematic search algorithms, such as chronological backtracking, forward-checking, Davis-Putnam and the Maintaining Arc Consistency algorithm (MAC) [24]. It is widely believed that the more sophisticated the search algorithm, the less likely it is that exceptionally hard problem instances will be observed [8, 13].
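As an illustrative sketch (not taken from the paper), the defining property of the Pareto-style heavy-tailed distributions mentioned above is a survival function that decays polynomially, P(X > x) ~ C·x^(−α), rather than exponentially. The function names and parameter defaults below are purely illustrative:

```python
import math


def pareto_survival(x, alpha=1.0, xm=1.0):
    """P(X > x) for a Pareto distribution with shape alpha and scale xm."""
    return (xm / x) ** alpha if x >= xm else 1.0


def exponential_survival(x, rate=1.0):
    """P(X > x) for an exponential (light-tailed) distribution."""
    return math.exp(-rate * x)


if __name__ == "__main__":
    # The power-law tail dominates the exponential tail at large x:
    # extremely long runs retain non-negligible probability.
    for x in (1, 10, 100, 1000):
        print(f"x={x:5d}  Pareto tail={pareto_survival(x):.2e}  "
              f"exponential tail={exponential_survival(x):.2e}")
```

The slow polynomial decay is what makes "exceptionally hard" instances appear with non-negligible frequency in heavy-tailed runtime distributions, whereas a light-tailed distribution makes such extreme runs vanishingly rare.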