Sudoku: Algorithm X

Link to project here.

Algorithm X is a recursive, nondeterministic, depth-first, backtracking algorithm that Donald Knuth (the creator) referred to as "the most obvious trial-and-error approach" to solve the exact cover problem.

Follow the link above to see the algorithm's procedure. Here's the algorithm as I understand it:

Check if there are unsatisfied columns in matrix A, if there aren't then the current solution is valid, exit.
Otherwise choose a column c to satisfy.
Choose a row r that satisfies c.
Add r to the partial solution.
For each column j that r also satisfies,
  Delete every row that would satisfy j.
  Delete j from A.
Repeat this algorithm recursively on the reduced A.

While looking into it, I noticed that most tutorials describe the concept of Algorithm X and the move on directly into the Dancing Links implementation of the algorithm called DLX. While it's easy to see why DLX would be more efficient, I decided to try and implement Algorithm X without it just to see the difference in both speed and complexity of implementation.

Implementation
Conclusion

Implementation

State Object

Like the previous basic search algorithms, this requires storage of the partial solution as we progress through it. Unlike them however, we can't simply store partial solutions as grid objects as it would be very expensive to encode and decode them at every step. So we'll need an extra object to store the current state of the algorithm, not just the grid. This information includes:

The exact cover matrix for the grid.
A lookup table to associate exact cover matrix row indices to the grid cell[i,j] with value v.
A set of the current partial solution
A set of the exact cover matrix rows that haven't been evaluated
For efficiency, a set of columns that are satisfied by the current partial solution.

Item number 1 and 2 both stay the same throughout the process, but are a part of the state object for convenience.

Lookup Table

As said in the section above, this table is to associate exact cover matrix row indices to the grid cell[i,j] with value v. This will allow us to reconstruct the grid simply from the row indices in the solution set.

All that is happening is we are looping through the values of i, j and v, and then mapping the row index returned by exactCover.getRowIndex() to the given combination.

const mkLookup = grid => {
  const gMatrix = grid.get("matrix")
  const symbols = grid.get("symbols")
  // Create maps of matrix row to grid coords and value for easy lookup
  return Immutable.Map().withMutations(mutable => {
    for (let i=0; i<gMatrix.count(); i++) {
      for (let j=0; j<gMatrix.count(); j++) {
        // Using ecMatrix row index as key
        symbols.forEach(v => {
          const row = exactCover.getRowIndex(i, j, v, grid)
          mutable.set(row, Immutable.Map({"i":i, "j":j, "v":v}))
        })
      }
    }
  })
} // End mkLookup

Solution Set

A given puzzle for sudoku is generally partially filled so that we have enough clues to reach a unique solution. In order to account for the cells that have already been filled, we have to pre-fill the solution set with the rows representing the existing cells.

Here, we loop through each cell of a given grid. If the cell contains a value, it will calculate the exact cover matrix row index that represents the cell and value combination and add it to the solution set.

const initSolution = grid => {
  const gMatrix = grid.get("matrix")
  const symbols = grid.get("symbols")
  return Immutable.Set().withMutations(mutable => {
    for (let i=0; i<gMatrix.count(); i++) {
      for (let j=0; j<gMatrix.count(); j++) {
        const v = gMatrix.getIn([i,j])
        if (v != " ") {
          // If value exists, then it is part of the solution
          const row = exactCover.getRowIndex(i, j, v, grid)
          mutable.add(row)
        }
      }
    }
  })
} // End initSolution

Satisfied Set

Step 4 of Algorithm X mentions deleting columns that are satisfied by rows that are part of the partial solution. Unfortunately, actually deleting columns with my implementation of the exact cover matrix would be quite tedious and expensive. So a much simpler way is to simply mask the columns. We do this by adding the satisfied columns to a set and then excluding them when searching for an unsatisfied column.

This is initially done by iterating through each row in the solution set, and the adding each column that the row satisfies to the set.

const initSatisfied = (solution, ecMatrix) => {
  return Immutable.Set().withMutations(mutable => {
    solution.forEach(row => ecMatrix.get(row).forEach((state, col) => { 
      if(state>0) mutable.add(col)
    }))
  })
} // End initSatisfied

Open Set

This set helps us track what rows are available for evaluation. This is basically Matrix A in the pseudocode.

Here we iterate through each cell in the grid looking for empty cells. When a cell is empty, we can add it to the open set. also take an additional steps to filter out rows that would clash with the existing solution. While there are multiple locations in the algorithm where this step can take place, but I found that this filtering needs to be done here to more easily enable a Greedy variant of Algorithm X.

const initOpen = (grid, ecMatrix, satisfied) => {
  const gMatrix = grid.get("matrix")
  const symbols = grid.get("symbols")
  return Immutable.Set().withMutations(mutable => {
    for (let i=0; i<gMatrix.count(); i++) {
      for (let j=0; j<gMatrix.count(); j++) {
        const v = gMatrix.getIn([i,j]);
        if (v == " ") {
          // If value doesn't exist, then the cell is open
          symbols.forEach(s => {
            // Only add rows that don't clash with existing solution
            const row = exactCover.getRowIndex(i, j, s, grid)
            const cols = ecMatrix.get(row).map((s,i) => s>0 ? i : -1).filter(col => col>=0)
            if(satisfied.intersect(cols).count()<=0) mutable.add(row)
          })
        }
      }
    }
  })
} // End initOpen

The Algorithm

First thing we need to do is choose an unsatisfied column to evaluate. This step is pretty simple since we have the satisfied set, so all we need to do is to iterate through the column indices of the exact cover matrix and find one that is not in the set.

Getting The Candidates

const getUnsatisfiedCol = state => { 
  for (let i=0; i<state.getIn(["ecMatrix", 0]).count(); i++) {
    if(!state.get("satisfied").has(i)) return i;
  }
  return -1;
}

Then using the selected column, we find all the rows (candidates) in the exact cover matrix that satisfy the column.

const getCandidates = (state, col) => state.get("open").filter(row => state.getIn(["ecMatrix", row]).get(col)>0)

Both these functions are wrapped into a single function that is called by getNext(). If there are no unsatisfied rows this also lets us exit early.

const getFirstCandidates = state => {
  const col = algoX.getUnsatisfiedCol(state)
  return col>=0 ? algoX.getCandidates(state, col) : Immutable.Set()
},

Updating the States

Now that we can get a set of candidates, we will we have to remove all candidates from the open set because they are all mutually exclusive from each other. Then we generate generate different state objects where each represents a state where we have used one candidate from the set. This is done by adding the candidate to the solutions set, updating the satisfied set with new columns and filtering out the rows that are now invalid due to adding the candidate to the solution set.

const getNext = (state, isGreedy) => {
  const candidates = isGreedy ? algoX.getBestCandidates(state) : algoX.getFirstCandidates(state)
  // Subtract both valid and invalid candidates so that states following this will not 
  // have to process candidates already shown to be invalid
  const nextState = state.set("open", state.get("open").subtract(candidates))
  return candidates
    //.filter(c => algoX.rowIsValid(nextState, c)) // filter option 2
    .map(c => nextState.withMutations(stateMutable => { // For each candidate,
      // Add to solution,
      stateMutable.set("solution", stateMutable.get("solution").add(c)) 
      // Add cols that candidate satisfies to satisfied set.
      stateMutable.set("satisfied", stateMutable.get("satisfied").union(algoX.getSatisfiedCols(stateMutable, c)))
      // Remove rows that have been made invalid by this candidate
      // filter option 1
      stateMutable.set("open", stateMutable.get("open").filter(row => algoX.rowIsValid(stateMutable, row)))
    }))
    .toList()
}

Notice that I have two options for filtering the candidates in the open set, the first being the at the state generation. Filtering at this location has the benefit of increasing the accuracy of the greedy selector (later section), since it will only be counting valid candidates as a result and lead us down fewer dead ends. The downside of this approach is that we end up running the filtering for states that may never get evaluated, which significantly slows down performance.

Option 2 on the other hand, passes the buck along to only states that are being evaluated. However, the greedy selector will end up ignoring columns that, while having a larger number of candidates, actually has the fewest numbers of valid ones. But then why not filter inside the greedy selector? Because it impacts performance far more than the first two options. Ultimately the correct decision will require some analysis.

Implementing the Search

With a way of getting the list of following states, what remains is simply to wrap a depth first search around it. The code used is essentially the same as the basic depth first search in Part 2 of this blog series, just with more housekeeping steps.

const solveStep = isGreedy => data => {
  const moves = data.get("moves")
  const state = data.get("state")

  if(algoX.solutionFound(data)) 
    // Only set isComplete to true if the solution is actually valid
    if(!data.getIn(["grid","isComplete"]) && fnGrid.validate(data.get("grid"))) {
      return data.setIn(["grid","isComplete"], true)
    } else return data
  else {
    // Solution wasn't found so get the list of moves from here and choose one
    const newMoves = moves.concat(algoX.getNext(state, isGreedy))
    return data.withMutations(mutable => {
      mutable.set("state", newMoves.last())
      mutable.set("grid", algoX.updateGrid(newMoves.last(), data.get("inputGrid")))
      mutable.set("moves", newMoves.pop())
    })
  }
}

Most notably, the grid object now needs to be explicitly updated based on the current partial solution. This uses the lookup table we discussed earlier to find the values associated with the row number. This then updates any cells that were not part of the initial puzzle grid.

const updateGrid = (state, grid) => {
  return grid.set("matrix", grid.get("matrix").withMutations(mutable => {
    state.get("solution").forEach(row => {
      const s = state.get("lookup").get(row)
      // If statement not strictly needed, just want to prevent overriding original input cells
      // to make possible bugs more obvious
      if(mutable.getIn([s.get("i"), s.get("j")])==" ") {
        mutable.setIn([s.get("i"), s.get("j")], s.get("v"))
      }
    })
  }))
}

Greedy Variant

While testing the solver, I realized that Algorithm X is essentially just a replacement of the isValidMove() and getEmptyCell() functions from the basic search implementation. It will still have the same drawbacks we found for depth first search. If that's the case, then it can also benefit from being greedy.

Instead of just dealing with the first unsatisfied column it comes across, we can look for columns that have the least candidates, which will produce the least number of branching paths.

The first thing we need is to get a list of all columns that have not been satisfied. This simple generates a set of numbers in a range up the number of columns in the exact cover matrix and then removes any have already been satisfied.

const getUnsatisfiedCols = state => {
  return fnArr.rangeSet(state.getIn(["ecMatrix", 0]).count())
    .subtract(state.get("satisfied"))
    .toList()
}

Then we can count the number of candidates that available for each column and pick out the column with the least. If there are columns with no valid candidates, we can immediately discard everything as this means that this path is guaranteed to lead to a dead end.

const getBestCandidates = state => {
  const cols = algoX.getUnsatisfiedCols(state)
  // Recursively iterate through the unsatisfied columns in the exact cover matrix to find the column with the least candidates
  const _getBestCandidates = (cols, i=0, best=Immutable.Set()) => {
    if(i<cols.count()) {
      const cs = algoX.getCandidates(state, cols.get(i))
      // If there is an unsatisfiable column, exit early, this state is invalid
      if (cs.count()<=0) return Immutable.Set()
      // If there is a column with less candidates, that is the new best and continue
      else if (i==0 || (cs.count()<best.count())) return _getBestCandidates(cols, i+1, cs)
      else return _getBestCandidates(cols, i+1, best) // If not, keep the current best
    } else return best
  }
  return _getBestCandidates(cols)
}

Conclusion

This algorithm is basically an alternative to the simple isValidMove() function in the basic search. Ultimately whether or not the additional complexity is worth it for Sudoku is up for debate, but I can see that it can be very useful for more complex games/tasks.

Also, this implementation so pretty slow as expected, can't wait to try out Dancing Links.