The TU Delft Sudoku Solver on FPGA

Kees van der Bok (1), Mottaqiallah Taouil (2), Panagiotis Afratis (3), Ioannis Sourdis (4)

Computer Engineering, Delft University of Technology, The Netherlands

(1) C.vanderBok@student.tudelft.nl, (2) M.Taouil@tudelft.nl, (3) P.Afratis@student.tudelft.nl, (4) I.Sourdis@tudelft.nl

Abstract: Solving Sudoku puzzles is a mind-bending activity that many people enjoy in their spare time. For those acquainted with computers, building a computing engine for sudoku solving therefore becomes an irresistible challenge. Many sudoku solvers have been developed recently, using advanced techniques and algorithms to speed up the computation. In this paper, we describe a hardware design for an FPGA implementation of a sudoku solver. Furthermore, we show the performance of this design for solving puzzles of order N from 3 to 15.

I. INTRODUCTION

Using an FPGA to solve a sudoku puzzle is an interesting challenge and a valuable test case for general-purpose algorithm execution on FPGA. Recently, executing general-purpose applications on FPGA has grown in popularity and has proven to be more efficient in many cases. Although FPGAs are much slower than general-purpose processors (GPPs) in terms of operating frequency, better performance can be achieved by exploiting a high degree of parallelism and customization, as well as the ability to keep data local.

Before designing our sudoku solver we considered the available algorithms and how they could be mapped to an FPGA. Most algorithms turned out to have excessive resource requirements, either in memory size or in logic. Our design choice is a brute-force algorithm, the only design option that fit in the target FPGA device (Virtex2P-30). We expected the brute-force algorithm to be faster than a software version, because of the more efficient way in which the valid symbols for a cell can be determined.
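To make concrete what determining the valid symbols for a cell involves, the following is a minimal software sketch. It is not the hardware design described in this paper. It assumes the puzzle is stored as a list of lists with 0 marking an empty cell; the function name `valid_symbols`, the parameter `n` (standing for the order N), and the 4 x 4 example grid are illustrative assumptions.

```python
def valid_symbols(grid, n, row, col):
    """Return the symbols that may still be placed in grid[row][col].

    grid is an (n*n) x (n*n) list of lists; 0 marks an empty cell.
    Bit k-1 of the mask `used` is set when symbol k is already taken.
    """
    size = n * n
    used = 0
    # Symbols already present in the same row and in the same column.
    for i in range(size):
        if grid[row][i]:
            used |= 1 << (grid[row][i] - 1)
        if grid[i][col]:
            used |= 1 << (grid[i][col] - 1)
    # Symbols already present in the same n x n box.
    r0, c0 = (row // n) * n, (col // n) * n
    for r in range(r0, r0 + n):
        for c in range(c0, c0 + n):
            if grid[r][c]:
                used |= 1 << (grid[r][c] - 1)
    # Every symbol whose bit is still clear is a valid choice.
    return [k for k in range(1, size + 1) if not used & (1 << (k - 1))]


# Order-2 (4 x 4) puzzle, chosen only for illustration.
example = [
    [1, 0, 0, 0],
    [0, 0, 3, 0],
    [0, 4, 0, 0],
    [0, 0, 0, 2],
]

print(valid_symbols(example, 2, 0, 1))  # symbols allowed in row 0, column 1
```

In this sketch the row, column, and box are inspected one cell at a time, so each query costs on the order of 3·N² sequential look-ups. This is the step that the text above expects to be performed more efficiently in hardware.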
Although we have improved the basic step of the algorithm, the symbol selection, we have not addressed the exhaustive nature of the brute-force solver (i.e. the solver may have to go through all valid symbol assignments). This causes the solver to become intractable for hard or large sudoku problems. Therefore, we conclude that the brute-force solver needs to be enriched with techniques that prune the search space. We explored the benefits of filling empty cells in a particular order. Although this reduces the solving time, it does not make the hard and large problems tractable.

The following sections describe our brute-force sudoku solver implemented on an FPGA. The algorithm and design are explained and the performance is analyzed.

II. SUDOKU SOLVING

Automatic solving of sudoku puzzles can be done in various ways. Several well-known problems for which algorithms exist show similarity to the sudoku problem. Solving a sudoku is, foremost, a constraint satisfaction problem, but it can also be regarded as an exact-cover problem, a graph-coloring problem, or a binary-satisfaction problem.

Besides converting the sudoku problem to a known problem, sudoku solvers that mimic the human solving scheme have been developed. This solving method is usually referred to as the elimination or solving-by-logic method. In essence, the method is very closely related to the graph-coloring problem. The elimination method uses logical reasoning based on the constraints of the sudoku puzzle to exclude the symbols that cannot be placed in a certain cell. The crux of the method is that if one candidate remains when all others have been proven infeasible, that symbol can be filled in. Usually these solvers implement rules derived from pen-and-paper methods.

The algorithm requires the possible candidates to be kept in memory. This can be done by assigning to each cell of the puzzle a bitmap in which each bit represents a certain symbol (e.g.
bit 0 represents symbol 1, etc.). A set bit indicates that the symbol represented by that bit is a candidate for the cell the bitmap relates to. Elimination rules must be applied until one candidate remains (i.e. one and only one bit remains set in the bitmap). The cell can then be filled with the symbol represented by this bit. An advantage of this algorithm is that it is suitable for parallel execution.

Unfortunately, the elimination algorithm requires more memory than is available on the FPGA. Storing the bitmaps requires $N^4 \times 225$ bits: a 225-bit bitmap for each of the $N^4$ cells, where $225 = 15^2$ is the number of symbols at the largest supported order. For $N = 15$ we need

$$15^4 \times 225 = 11\,390\,625 \text{ bits} \approx 11.4\ \text{Mb}$$

This exceeds the available memory of the V2P30 FPGA, which offers only 1.4 Mb of Block RAM. There are other, less memory-demanding methods of storing the sudoku information. However, these methods decrease the amount of information available and therefore weaken the elimination method considerably.

Another issue we discovered when analyzing this method is that the algorithm is not complete. Therefore, the algorithm is not