SATzilla: An Algorithm Portfolio for SAT*

Eugene Nudelman, Alex Devkar, Yoav Shoham
Department of Computer Science, Stanford University

Kevin Leyton-Brown, Holger Hoos
Department of Computer Science, University of British Columbia

1 Introduction

Inspired by the success of recent work in the constraint programming community on typical-case complexity, in [3] we developed a new methodology for using machine learning to study the empirical hardness of hard problems on realistic distributions. In [2] we demonstrated that this new approach can be used to construct practical algorithm portfolios. In brief, the fact that algorithms for solving NP-hard problems often have relatively uncorrelated runtimes means that it is possible for a portfolio to outperform all of its constituent algorithms. However, such uncorrelation is a knife that cuts both ways: a portfolio that makes bad choices among its constituent algorithms will often perform much worse than any of them. Our methodology can be outlined as follows:

Offline, as part of algorithm development:

1. Identify a target distribution of problem instances.
2. Select a set of algorithms having relatively uncorrelated runtimes on this distribution.
3. Using domain knowledge, identify features that characterize problem instances.
4. Compute features and determine algorithm running times.
5. Use regression to construct models of the algorithms' runtimes.

Online, given an instance:

1. Compute feature values.
2. Predict each algorithm's running time using the learned runtime models.
3. Run the algorithm predicted to be fastest.

* See [5] for a complete discussion of SATzilla.

2 SATzilla

SATzilla is a portfolio of SAT solvers built according to the methodology described above. It includes the following solvers: 2clseq, Limmat, JeruSat, OKsolver, Relsat, Sato, Satz-rand, zChaff, eqSatz, Satzoo, kcnfs, and BerkMin.
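To make the offline/online pipeline above concrete, the following is a minimal sketch, not the authors' implementation: closed-form ridge regression stands in for the original training code, the solvers, features, and training data are synthetic stand-ins, and solver execution is omitted.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam I)^{-1} X^T y.

    Minimizes squared error plus a penalty on large coefficients,
    as in the models described above.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# --- Offline: fit one runtime model per solver on toy data ---
# (hypothetical solvers and features; real models use instance
# features and measured runtimes)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # 100 instances, 3 features
true_w = {"solver_a": np.array([1.0, 0.0, 2.0]),
          "solver_b": np.array([0.0, 3.0, 1.0])}
models = {name: fit_ridge(X, X @ w + rng.normal(scale=0.1, size=100))
          for name, w in true_w.items()}

# --- Online: predict each solver's runtime, run the predicted fastest ---
x = np.array([1.0, 0.5, 0.2])  # features of a new instance
predicted = {name: float(x @ w) for name, w in models.items()}
best = min(predicted, key=predicted.get)
```

In SATzilla itself there is one such model per constituent solver, the feature vector has 56 entries, and the final step dispatches the instance to the solver with the smallest predicted runtime.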
We began by assembling a broad library of about 5000 SAT instances, gathered from various public websites. We identified 83 features that could be computed quickly and that we felt might be useful for predicting runtime. We computed these features for our set of SAT instances, dropped some features that were highly correlated, and were left with 56 distinct features. To keep feature values in sensible ranges, where appropriate we normalized features by the total number of clauses or variables. We also computed runtimes for each algorithm on each of our SAT instances.

Given our features and runtime data, we had a well-defined supervised learning problem. We built models using ridge regression, a machine learning technique that finds a linear model (a hyperplane in feature space) minimizing a combination of root mean squared error and a penalty term on large coefficients. To yield better models, we ignored all instances that were solved by all algorithms, by no algorithms, or as a side-effect of feature computation.

Upon execution, SATzilla begins by running a UBCSAT [6] implementation of WalkSat for 30 seconds. In our experience, this step helps to filter out easy satisfiable instances. Next, SATzilla runs the Hypre [1] preprocessor, which uses hyper-resolution to reason about binary clauses. This step can dramatically shorten the formula, often resulting in search problems that are easier for DPLL-style solvers. Perhaps more importantly, the simplification "cleans up" instances, allowing the subsequent analysis of