Sudoku: Puzzle Generation
Link to project here.
Aside from a solving the puzzles, I also wanted the to supply puzzles. It turns out that generating graded puzzles (as in grading them Easy, Medium or Hard) is actually not a straight forward process as explained by Gareth Rees and Andrew Stuart in their very informative articles. Thus, I just decided to make a way to randomly generate a puzzle without worrying about the difficulty grading as ultimately, what's hard for a computer is different for a human. The upside to this is I will be able to generate puzzles on the fly rather than having to pre-generate and store them in a database.
Generating Puzzles
When creating a puzzle for Sudoku, you actually need to start with a solution and then work your way back. So really, I need a random solution generator.
I can't place random values into random positions on the puzzle as that would be incredibly inefficient, and I can't start from a blank puzzle because DLX is a deterministic solver and it will always come up with the same solutions in the same order. Instead, I need to start with what I call a seed state, which is similar to what the two experts did.
Seed Generation
To generate a seed state, I just make an empty state, pick from a list of valid value without replacement and then place them into a random location in the puzzle. This ensures that there will be no clashing values. I had originally written this in Java using a different format to represent the Sudoku state, but thankfully I could reuse most of the code as is. I also found a functional way of converting a 1D array to a 2D array.
const mkSeed = size => {
const gridStr = fnGrid.VALID_SYMBOLS
.slice(0, size)
.padEnd(size*size, fnGrid.EMPTY_SYMBOL2)
return fnArr.shuffle(Immutable.List(gridStr.split("")))
// Reduce a 1D array to 2D array
.reduce((rows, value, index) => (index % size == 0 ? rows.push([value])
: rows[rows.length-1].push(value)) && rows, [])
.map(row => row.join(""))
.reduce((str, row) => str + "\n" + row)
}
Solving the Seed
After a seed state is generated, it has to be solved. I found that this method of seed generation does not actually guarantee a solvable state. For example, the state for size 4 puzzle "0100020030000000"
is technically valid while still not being solvable. So the solution generated needs to just keep trying until it finds a solvable state. Fortunately, chains of these tend to be relatively rare and does not significantly impact the puzzle generation time.
const mkSolution = size => {
while (true) {
const grid = fnGrid.importString2(generator.mkSeed(size))
const solved = dlx.solve(true)(grid)
if (solved.get("isComplete")) return solved
}
}
Reducing the Solution to a Puzzle
Once a solution is found, we just need to remove cells until we are satisfied with it. I've allowed the setting of number of empty cells as a primitive method of controlling difficulty. Ideally a Sudoku puzzle has only one solution, but I've also allowed for setting the maximum number of solutions because it could be useful for testing other solvers.
const mkPuzzle: (size, maxSolutions, maxEmptyCells, stepLimit=1000) => {
const indices = fnArr.range(size)
const positions = fnArr.shuffle(indices.flatMap(i => indices.map(j => [i, j]))).toJS()
let best = generator.mkSolution(size)
let nEmptyCells = 0
for (let i=0; i<positions.length; i++) {
const sol = best.withMutations(mutable => {
mutable.setIn(["matrix", positions[i][0], positions[i][1]], fnGrid.EMPTY_SYMBOL)
mutable.set("isComplete", false)
})
const data = dlx.search(sol, maxSolutions+1, stepLimit)
if (data.get("state").solutions.length>0 && data.get("state").solutions.length <= maxSolutions) {
best = sol
nEmptyCells += 1
if (nEmptyCells>=maxEmptyCells) return best
}
}
return best
}
Once a shuffled list of positions on the board is generated, I just iterate through it, removing the number at each position and checking if the resulting puzzle still meets the given criteria.
The implementation above is very inefficient and slower than I was expecting. Most of the computer's time is actually spent recreating the Dancing Links Matrix (DLM) on each iteration even though they always start out the same. I initially tried to modify dancing-links.js
to create a resettable DLM, however that seemed to cause some strange and undesirable behavior. So the task of making it more efficient will have to be handled by future me.