IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 4, APRIL 2009 1741
Sinkhorn Solves Sudoku
Todd K. Moon, Senior Member, IEEE, Jacob H. Gunther, Member, IEEE, and Joseph J. Kupin
Abstract—The Sudoku puzzle is a discrete constraint satisfaction
problem, as is the error correction decoding problem. We propose
here an algorithm for solving the Sudoku puzzle based on
Sinkhorn balancing. Sinkhorn balancing is an algorithm for projecting
a matrix onto the space of doubly stochastic matrices. The
Sinkhorn balancing solver is capable of solving all but the most
difficult puzzles. A proof of convergence is presented, with some
information theoretic connections. A random generalization of the
Sudoku puzzle is presented, for which the Sinkhorn-based solver
is also very effective.
Index Terms—Belief propagation (BP), constraint satisfaction,
low-density parity-check (LDPC) decoding, Sinkhorn, Sudoku.
I. INTRODUCTION
THE Sudoku puzzle is a discrete constraint satisfaction
problem, where the constraints address uniqueness of
subsets of the puzzle, based on some observed “clues.” Error
correction decoding is also a discrete constraint satisfaction
problem, where the constraints address parity of subsets of the
observed values. The low-density parity-check (LDPC) belief
propagation (BP) decoding algorithm can be adapted to solve
Sudoku, since the puzzle has a Tanner graph representation.
On the other hand, a different solver for Sudoku may suggest
an alternative decoding algorithm, resulting in an interesting
interplay between Sudoku and error correction coding.
A Sudoku puzzle may be solved by logical elimination, as ex-
haustively described in [1], where nearly a dozen distinct rules
are described. By contrast, our method employs a probabilistic
representation of the puzzle. The probabilistic representation
provides a “suspension of belief” that allows searching of po-
tential solutions.
The connection between decoding and Sudoku has been
previously noted. The paper [2] gives an explicit formulation of
the BP algorithm for solving Sudoku. BP appears to work only
for easier puzzles, with the probable cause being the “loopy”
nature of the Tanner graph associated with the puzzle (all cells
are in cycles of length four). The presentation [3] also discusses
BP. In this paper, we present a solution algorithm based on
Sinkhorn balancing, which has lower computational complexity
(per iteration) than BP and apparently does not suffer
from cycles in the graph. Sinkhorn balancing [4] is a means
of obtaining a unique doubly stochastic matrix from a (nearly)
arbitrary matrix. Extensions to produce matrices with arbitrary
row and column sums appear in [5]. Sinkhorn balancing
(sometimes called Sinkhorn scaling) has been widely studied,
and makes its appearance in a variety of applications. (See, for
example [6].) The Sinkhorn balancing approach to solution is
successful at solving all but the most difficult Sudoku puzzles.
Sinkhorn balancing furthermore generalizes well to situations
in which clues are presented as random elements in a set.
Manuscript received December 13, 2006; revised May 08, 2007. Current version
published March 18, 2009.
T. K. Moon and J. H. Gunther are with the Electrical and Computer Engineering
Department, Utah State University, Logan, UT 84322 USA (e-mail:
Todd.Moon@usu.edu; jake@ece.usu.edu).
J. J. Kupin is with the Center for Communications Research, Princeton, NJ
08544 USA (e-mail: joseph.kupin@verizon.net).
Communicated by G. Seroussi, Associate Editor for Coding Theory.
Digital Object Identifier 10.1109/TIT.2009.2013004
As there are other methods of solving Sudoku puzzles, the
method presented here needs some justification. Our exploration
was motivated by a desire to develop decoding algorithms for
linear codes having many cycles in their Tanner graphs. While
the BP algorithm fares poorly for such codes (and Sudoku puz-
zles), the Sinkhorn balancing approach appears to be much more
robust. This success may in the future lead to insights on how
to apply Sinkhorn-like techniques to the decoding problem.
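As a rough illustration of the Sinkhorn balancing mentioned above, the following sketch alternately rescales the rows and columns of a single square matrix until both sets of sums approach one. It is only the generic balancing step, not the Sudoku solver of this paper; the function name `sinkhorn_balance`, the fixed iteration count, and the sample matrix are assumptions made for the example, and strictly positive entries are assumed so that no row or column sums to zero.

```python
def sinkhorn_balance(matrix, iterations=200):
    """Alternately normalize rows and columns of a square positive matrix."""
    a = [row[:] for row in matrix]  # work on a copy, leave the input intact
    n = len(a)
    for _ in range(iterations):
        # Scale each row so that it sums to 1.
        for i in range(n):
            s = sum(a[i])
            a[i] = [x / s for x in a[i]]
        # Scale each column so that it sums to 1.
        for j in range(n):
            s = sum(a[i][j] for i in range(n))
            for i in range(n):
                a[i][j] /= s
    return a


# Hypothetical input: any strictly positive square matrix will do.
balanced = sinkhorn_balance([[1.0, 2.0, 3.0],
                             [4.0, 5.0, 6.0],
                             [7.0, 8.0, 10.0]])
row_sums = [sum(r) for r in balanced]
col_sums = [sum(c) for c in zip(*balanced)]
```

After the loop, `row_sums` and `col_sums` should both be close to one, which is what makes the result (approximately) doubly stochastic.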
II. PUZZLE DESCRIPTION AND REPRESENTATION
A Sudoku puzzle is an $N \times N$ grid of cells partitioned into $N$
smaller blocks of $N$ elements each. The puzzle problem is
to fill in the cells so that the digits $1, 2, \ldots, N$ appear uniquely
in each row and column of the grid and in each block, starting
from some initial set of filled-in cells called “clues.” The uniqueness
requirement imposes $3N$ constraints on the puzzle. The
following is an example of a puzzle:
[Puzzle grid, equation (1)]
We denote the contents of cell $n$ by $S_n \in \{1, 2, \ldots, N\}$ for
$n = 1, 2, \ldots, N^2$, with cells numbered in row-scan order. The row
constraints are indexed by the numbers down the side
of the puzzle [as indicated in (1)]; the column constraints are indexed
by the numbers across the top of the puzzle.
[The box constraints are not shown in (1).] Constraint $m$ of the
puzzle is satisfied if all cells associated with it are distinct.
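The row, column, and box constraints described above can be sketched as lists of cell indices. The sketch below makes several assumptions that are not taken from the text: cells are numbered from 0 (rather than 1) in row-scan order, boxes are square with side equal to the square root of the grid size, constraints are ordered rows, then columns, then boxes, and the names `build_constraints` and `constraint_satisfied` are hypothetical.

```python
import math


def build_constraints(n):
    """Return 3n lists of cell indices: n rows, n columns, n boxes."""
    b = math.isqrt(n)
    if b * b != n:
        raise ValueError("n must be a perfect square")
    rows = [[r * n + c for c in range(n)] for r in range(n)]
    cols = [[r * n + c for r in range(n)] for c in range(n)]
    boxes = [
        [(br * b + dr) * n + (bc * b + dc)
         for dr in range(b) for dc in range(b)]
        for br in range(b) for bc in range(b)
    ]
    return rows + cols + boxes


def constraint_satisfied(cells, constraint):
    """A constraint holds when the cells it covers hold distinct values."""
    values = [cells[i] for i in constraint]
    return len(set(values)) == len(values)


# A small 4 x 4 example (hypothetical data) to keep the listing short.
constraints = build_constraints(4)
solved = [1, 2, 3, 4,
          3, 4, 1, 2,
          2, 1, 4, 3,
          4, 3, 2, 1]
all_ok = all(constraint_satisfied(solved, c) for c in constraints)
```

For a 4 x 4 grid this yields 12 constraints, and the filled grid shown satisfies all of them.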
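We model the contents of the cells probabilistically. Let
$\mathbf{p}_n = [P(S_n = 1), P(S_n = 2), \ldots, P(S_n = N)]$ be the
probability row vector associated with cell $n$, with individual
elements $p_{n,i} = P(S_n = i)$, the probability that cell $n$ holds
digit $i$. Cells that are specified initially (the clue
cells) place all their probability mass on the specified value.
Thus, for the puzzle in (1), a clue cell $n$ whose specified value is $k$
has $p_{n,k} = 1$ and $p_{n,i} = 0$ for $i \neq k$.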
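The probabilistic representation of the cells can be sketched as follows. Clue cells receive all of their probability mass on the given digit, as stated above. The uniform starting vector for empty cells is an assumption of this sketch, since the text so far does not say how unspecified cells are initialized; the function name `initial_probabilities`, the use of 0 to mark an empty cell, and the 4 x 4 sample puzzle are likewise hypothetical.

```python
def initial_probabilities(puzzle, n):
    """Build one probability row vector per cell.

    puzzle: list of n*n integers in row-scan order, 0 marks an empty cell.
    """
    vectors = []
    for value in puzzle:
        if value:
            # Clue cell: all probability mass on the specified digit.
            vectors.append([1.0 if d == value else 0.0
                            for d in range(1, n + 1)])
        else:
            # Assumption: an empty cell starts uniform over the n digits.
            vectors.append([1.0 / n] * n)
    return vectors


# Hypothetical 4 x 4 puzzle, used only to exercise the function.
puzzle = [1, 0, 0, 4,
          0, 0, 1, 0,
          0, 1, 0, 0,
          4, 0, 0, 1]
p = initial_probabilities(puzzle, 4)
```

Each entry of `p` is a row vector that sums to one, so stacking them gives a matrix with one row per cell and one column per digit.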
0018-9448/$25.00 © 2009 IEEE