Rank-Cluster-and-Prune: An Algorithm for Generating Clusters in
Complex Set Partitioning Problems
Amy Cohn,¹ Michael Magazine,² George Polak³
¹ Department of Industrial and Operations Engineering, College of Engineering, University of Michigan, Ann Arbor,
Michigan 48109-2117
² Quantitative Analysis and Operations Management, College of Business, University of Cincinnati, Cincinnati, Ohio 45221
³ Department of Information Systems and Operations Management, Raj Soin College of Business, Wright State University,
Dayton, Ohio 45435
Received 10 July 2007; revised 4 November 2008; accepted 22 November 2008
DOI 10.1002/nav.20343
Published online 24 February 2009 in Wiley InterScience (www.interscience.wiley.com).
Abstract: Clustering problems are often difficult to solve due to nonlinear cost functions and complicating constraints. Set parti-
tioning formulations can help overcome these challenges, but at the cost of a very large number of variables. Therefore, techniques
such as delayed column generation must be used to solve these large integer programs. However, the underlying pricing problem can
suffer from the same challenges (nonlinear cost, complicating constraints) as the original problem, making a mathematical
programming approach intractable. Motivated by a real-world problem in printed circuit board (PCB) manufacturing, we
develop a search-based algorithm (Rank-Cluster-and-Prune) as an alternative, present computational results for the PCB problem to
demonstrate the tractability of our approach, and identify a broader class of clustering problems for which this approach can be
used. © 2009 Wiley Periodicals, Inc. Naval Research Logistics 56: 215–225, 2009
Keywords: set partitioning; branch-and-price; delayed column generation; branch-and-bound
1. INTRODUCTION
Clustering problems, in which a group of objects must be
divided into nonoverlapping and exhaustive subsets, appear
in a wide variety of applications, ranging from transportation
(e.g. [5]) to manufacturing (e.g. [29]) to scheduling MBA
cohorts (e.g. [15]). When the cost function and/or the rules
governing the feasibility of subsets are complex, a set parti-
tioning model can often be formulated to avoid a nonlinear
objective function and/or complicating constraints.
Correspondence to: A. Cohn (amycohn@umich.edu)
Unfortunately, such formulations typically possess an
exponential number of integer variables. Very large integer
programs can sometimes be solved with branch-and-price, an
application-customized algorithm that uses delayed column
generation as a way to solve the large-scale linear programs
embedded within the branch-and-bound tree. Column gen-
eration, however, requires the repeated solving of a pricing
problem to identify candidate variables with negative reduced
cost. [These techniques are briefly summarized in the next
section.] When a set partitioning formulation is used as a
way to bypass complex constraints and objective functions,
this complexity must instead be addressed in the pricing prob-
lem. Thus, mathematical programming (MP) approaches are
often inadequate for solving this pricing problem. This was
our experience in attempting to solve a real-world problem
in integrated printed circuit board (PCB) planning.
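To make the role of the pricing problem concrete, the following sketch writes out the standard linear programming relaxation of a set partitioning model and the reduced cost of a column. The notation is assumed here for illustration and is not necessarily that used in later sections: objects are indexed by i = 1, ..., m, each feasible cluster S_j has cost c_j and a decision variable x_j, and \pi_i denotes the dual value of the constraint covering object i.

```latex
% Restricted master problem (LP relaxation) over the columns generated so far, J' \subseteq J
\begin{align*}
\min \quad & \sum_{j \in J'} c_j x_j \\
\text{s.t.} \quad & \sum_{j \in J' :\, i \in S_j} x_j = 1, \qquad i = 1, \dots, m
  \qquad (\text{dual value } \pi_i) \\
& x_j \ge 0, \qquad j \in J'.
\end{align*}

% Reduced cost of the column associated with cluster S_j
\[
\bar{c}_j \;=\; c_j - \sum_{i \in S_j} \pi_i .
\]

% Pricing problem: search over all feasible clusters S, with cost function c(S)
\[
\min_{S \text{ feasible}} \; \Bigl\{\, c(S) - \sum_{i \in S} \pi_i \Bigr\}.
\]
```

If the optimal value of the pricing problem is nonnegative, no column has negative reduced cost and the current linear programming solution is optimal over all columns; otherwise, a cluster with negative reduced cost is added to J' and the master problem is solved again. Any nonlinearity in c(S) and any complicated feasibility rules for S reappear inside this minimization.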
Motivated by this application, we have developed an alter-
native approach to the pricing problem, which we call Rank-
Cluster-and-Prune (RCP). RCP is a search-based technique
that, like branch-and-bound, uses a tree structure to enumer-
ate potential solutions. Rather than using linear programming
to construct the nodes, however, we make inclusion deci-
sions in an ordered way, allowing us to directly compute the
objective function. This is very powerful, as it enables us to
consider problems with a wide range of objective functions.
They need not be linear or convex. In fact, it is not even nec-
essary that we be able to write the objective function in closed
form. For example, we might compute it using Monte Carlo
simulation or a look-up table. The only restriction is that it
be nondecreasing in inclusion (i.e. when we add an object to a set, its
cost does not go down). Pruning based on dual potentials
prevents the exhaustive enumeration of the solution space