I've made a simple Sudoku class (in Python). Its main attributes are the following:

def __init__(self, it: iter = None):

    self.grid = [[None for _ in range(9)] for _ in range(9)]

    self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

    self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

    self.usedInBox = [[False for _ in range(10)] for _ in range(9)]



def writeCell(self, pos: tuple, val: int) -> bool: [...]



def eraseCell(self, pos: tuple): [...]

self.grid keeps track of the actual numbers in the Sudoku. self.usedInRow, self.usedInCol, and self.usedInBox are boolean arrays that keep track of which numbers have been used in each row/column/square. writeCell is a method that tries to write a number in an empty cell and returns whether it succeeded (according to the rules of Sudoku). Finally, eraseCell replaces a cell's content with None.

Inside the class I've written a recursive-backtracking solver method. When implementing the solver, I tried different approaches for selecting which cell to fill in next. The first thing I tried was simply starting from (0, 0) and incrementing the position left-to-right and top-to-bottom (with columns as the 'small' unit and rows as the 'big' unit).

But that's not general enough (if for some reason the solver started solving from a position other than (0, 0) every empty cell before that would be skipped). So I added a member to keep track of empty cells:

def __init__(self, it: iter = None):

    [...]

    self.emptyCells = [(i, j) for i in range(9) for j in range(9)]

I modified the writeCell and eraseCell methods to mantain the correct record. At this point the solver looks more or less like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells[0]



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: return False



    success = _solve()

    [...]

    return success

At this point I thought that maybe it would be better if the self.emptyCells member was a set instead of a list: I didn't want duplicates, didn't need indexed access (I thought that it wouldn't matter the order in which cells were filled in), and wanted fast deletion. So I modified the self.emptyCells attribute

self.emptyCells = set((i, j) for i in range(9) for j in range(9))

and the methods that referred to it (for example I used .discard() instead of .remove() in writeCell). The solver method now looked like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()  # Get arbitrary position



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: 

            self.emptyCells.add(pos)  # Backtrack

            return False



    success = _solve()

    [...]

    return success

To my surprise, the second approach (using a set) took about one minute to solve an empty Sudoku, whilst the first approach (using a list) took about two seconds. Why is that? Does the arbitrarily chosen position conferred by set.pop() affect the efficiency of the algorithm, compared to filling cells in order?

Full code

Using a list

import queue

import threading

import colorama



colorama.init()  # For ANSI escape codes





class Sudoku:

    @staticmethod

    def isSudoku(S: iter):

        return len(S) == 9 

               and all(len(row) == 9 for row in S) 

               and all(num in range(1, 10) for row in S for num in row)



    def __init__(self, it: iter = None):

        self.grid = [[None for _ in range(9)] for _ in range(9)]

        self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

        self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

        self.usedInBox = [[False for _ in range(10)] for _ in range(9)]

        self.emptyCells = [(i, j) for i in range(9) for j in range(9)]



        if it is not None:

            for i in range(9):

                for j, num in zip(range(9), it):

                    if num is not None:

                        if not self.writeCell((i, j), num): raise ValueError("Invalid Sudoku")



    def __iter__(self):

        return iter(self.grid)



    def __len__(self):

        return 9



    def __str__(self):

        def numsToStr(row):

            for num in row: yield num if num is not None else ' '



        def rowsBlock(i):

            yield ''

            for row in self.grid[3 * i:3 * (i + 1)]:

                yield '|{} {} {}|{} {} {}|{} {} {}|'.format(*numsToStr(row))

            yield ''



        def blocks():

            yield ''

            for i in range(3): yield 'n'.join(rowsBlock(i))

            yield ''



        return ('-' * 19).join(blocks())



    def __getitem__(self, key: tuple):

        if not len(key) == 2: raise TypeError("Two indices needed")

        r, c = key

        return self.grid[r][c]



    def __setitem__(self, key: tuple, value: int):

        if not len(key) == 2: raise TypeError("Two indices needed")

        self.eraseCell(key)

        self.writeCell(key, value)



    def _canWrite(self, r: int, c: int, b: int, val: int) -> bool:

        if self.usedInRow[r][val] or self.usedInCol[c][val] or self.usedInBox[b][val] 

            or self.grid[r][c] is not None: return False

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = True

        return True



    def writeCell(self, pos: tuple, val: int) -> bool:

        if val is None: return True  # Ignore writing none

        r, c = pos

        b = 3*(r//3) + c//3



        if not self._canWrite(r, c, b, val): return False

        # noinspection PyTypeChecker

        self.grid[r][c] = val

        self.emptyCells.remove(pos)

        return True



    def eraseCell(self, pos: tuple):

        r, c = pos

        b = 3*(r//3) + c//3



        val = self.grid[r][c]  # Get old value for reference

        if val is None: return  # If there wasn't anything in the first place

        self.grid[r][c] = None  # Actually erase

        self.emptyCells.append(pos)

        # noinspection PyTypeChecker

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False



    def solve(self) -> bool:

        printQueue = queue.PriorityQueue()



        def printer():

            while True:

                priority, item = printQueue.get()

                if item is StopAsyncIteration:

                    break

                print(item)

                print('x1b[13A', end='')

                printQueue.task_done()



        printThread = threading.Thread(target=printer, name="PrintThread")

        printThread.start()



        def _solve() -> bool:

            if not self.emptyCells: return True

            pos = self.emptyCells[0]



            for n in range(1, 10):

                if self.writeCell(pos, n):

                    if _solve(): return True

                    printQueue.put((1, self))

                    self.eraseCell(pos)

            else:

                return False



        success = _solve()

        printQueue.put((0, StopAsyncIteration))

        printThread.join()

        return success

Using a set

The only things that change are the following.

_solve() internal method:

    def _solve():

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                printQueue.put((1, self))

                self.eraseCell(pos)

        else:

            self.emptyCells.add(pos)

            return False

eraseCell and writeCell:

def writeCell(self, pos: tuple, val: int):

    if val is None: return True  # Ignore writing none

    r, c = pos

    b = 3*(r//3) + c//3



    if not self._canWrite(r, c, b, val): return False

    # noinspection PyTypeChecker

    self.grid[r][c] = val

    self.emptyCells.discard(pos)

    return True



def eraseCell(self, pos: tuple):

    r, c = pos

    b = 3*(r//3) + c//3



    val = self.grid[r][c]  # Get old value for reference

    if val is None: return  # If there wasn't anything in the first place

    self.grid[r][c] = None  # Actually erase

    self.emptyCells.add(pos)

    # noinspection PyTypeChecker

    self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False

Note

The reason I don't put back pos in the list version—and also why I get pos as self.emptyCells[0] instead of self.emptyCells.pop(0)—is because writeCell and eraseCell already take care of that: if writeCell succeeds in writing a (previously empty) cell, then it will call self.emptyCells.remove(pos)—and here's another reason why I'm not popping the list: otherwise I would be removeing a nonexistant element in the list, and would have to catch the ValueError.

After that, if backtracking occurs (i.e. the _solve() recursive call returns False), eraseCell will put pos back in the list. Thus, if the for loop "fails"—i.e., the else branch is entered—, the last eraseCell will already have put pos back.

On the other hand, when using a set, there is no way of accessing an element without removing it from the set (to my knowledge). Thus, I'm forced to pop from it and put it in back manually. That's why discard is used instead of remove in the set-version of writeCell, since pos may already have been popped by solve_.

An alternative maybe would be to save pos = self.emptyCells.pop() and add it straight back (self.emptyCells.add(pos)), and then proceed with solve_. Then I could change discard to remove.

Usage example

sudoku = Sudoku()

sudoku.solve()

print(sudoku)

Profiling results

I profiled both versions but didn't find any obvious culprit—that is, the proportions of cumulative time taken by each method were very similar, although the actual absolute time they took was just scaled up in the set version with respect to the list version.

Update — `pos` counts

I added an array to keep track of how many times each pos was selected to be explored, and I printed it at the end of the solve. Here are the results.

List version:

[[   1    1    1    1    1    1    1    1    1]

 [   1    1    1    1  145  671 7169 9869 7606]

 [ 774 1115  645    7 2066 1240  451  463  554]

 [ 511  466  504  424  554  422  358  502  566]

 [ 545  224  507  539  578  404  487  357  574]

 [ 421  421  452  436  465  406  326  621  483]

 [ 436  519  284  462  545  534  410  563  730]

 [ 433  561  311  397  365  254  308  708  384]

 [ 445  559  558  413  513  572  564  511  686]]

Set version (as far as I can tell, results are consistent):

[[ 19895      1  91239  66620      1      1  76794      1      1]

 [     1      1  77617  43061  95435      1      1  49617  77772]

 [ 54441      1      1  96081  79660      1      1      1  89575]

 [101233      1      1  85645  55172      1      1      1  54613]

 [     1      1  71512  61978   5531      1  76794  82692  74311]

 [ 96268      1  57136  92052      1      1  86104  91911      1]

 [     1  98119  95706  10680      1  97232  67210      1  39094]

 [ 53447      1      1      1  89846      1      1  88072      1]

 [ 74468  19103      1  39934  75698      1      1  90778  53260]]

edited Jan 12 at 13:35

asked Jan 2 at 16:09

Anakhand

426314

Nice code! Have you tried profiling, e.g. cProfile, to see where the performance boost takes place?

– Dascienz
Jan 4 at 17:25

@Dascienz Thanks! I've tried, but found no significant results. See the newly added last paragraph.

– Anakhand
Jan 4 at 19:11

Perhaps this answer would help you understand what's going on then? stackoverflow.com/questions/8929284/…

– Dascienz
Jan 4 at 19:20

@Dascienz Good resource! But the problem is exactly that. It's not the set that's faster than the list. It's the other way around, and not by little! Which is surprising... I guess it doesn't have to do with the data structure itself but rather the way set.pop() retrieves items (arbitrary instead of consistent).

– Anakhand
Jan 4 at 19:23

2

Did you count how may times the list vs the set function runs on the same sudoku? I suspect that the randomized version keeps adding and popping the same element several times. And therefore trying several paths more than once.

– swenzel
Jan 5 at 14:51

|
show 5 more comments

I've made a simple Sudoku class (in Python). Its main attributes are the following:

def __init__(self, it: iter = None):

    self.grid = [[None for _ in range(9)] for _ in range(9)]

    self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

    self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

    self.usedInBox = [[False for _ in range(10)] for _ in range(9)]



def writeCell(self, pos: tuple, val: int) -> bool: [...]



def eraseCell(self, pos: tuple): [...]

def __init__(self, it: iter = None):

    [...]

    self.emptyCells = [(i, j) for i in range(9) for j in range(9)]

I modified the writeCell and eraseCell methods to mantain the correct record. At this point the solver looks more or less like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells[0]



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: return False



    success = _solve()

    [...]

    return success

self.emptyCells = set((i, j) for i in range(9) for j in range(9))

and the methods that referred to it (for example I used .discard() instead of .remove() in writeCell). The solver method now looked like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()  # Get arbitrary position



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: 

            self.emptyCells.add(pos)  # Backtrack

            return False



    success = _solve()

    [...]

    return success

Full code

Using a list

import queue

import threading

import colorama



colorama.init()  # For ANSI escape codes





class Sudoku:

    @staticmethod

    def isSudoku(S: iter):

        return len(S) == 9 

               and all(len(row) == 9 for row in S) 

               and all(num in range(1, 10) for row in S for num in row)



    def __init__(self, it: iter = None):

        self.grid = [[None for _ in range(9)] for _ in range(9)]

        self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

        self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

        self.usedInBox = [[False for _ in range(10)] for _ in range(9)]

        self.emptyCells = [(i, j) for i in range(9) for j in range(9)]



        if it is not None:

            for i in range(9):

                for j, num in zip(range(9), it):

                    if num is not None:

                        if not self.writeCell((i, j), num): raise ValueError("Invalid Sudoku")



    def __iter__(self):

        return iter(self.grid)



    def __len__(self):

        return 9



    def __str__(self):

        def numsToStr(row):

            for num in row: yield num if num is not None else ' '



        def rowsBlock(i):

            yield ''

            for row in self.grid[3 * i:3 * (i + 1)]:

                yield '|{} {} {}|{} {} {}|{} {} {}|'.format(*numsToStr(row))

            yield ''



        def blocks():

            yield ''

            for i in range(3): yield 'n'.join(rowsBlock(i))

            yield ''



        return ('-' * 19).join(blocks())



    def __getitem__(self, key: tuple):

        if not len(key) == 2: raise TypeError("Two indices needed")

        r, c = key

        return self.grid[r][c]



    def __setitem__(self, key: tuple, value: int):

        if not len(key) == 2: raise TypeError("Two indices needed")

        self.eraseCell(key)

        self.writeCell(key, value)



    def _canWrite(self, r: int, c: int, b: int, val: int) -> bool:

        if self.usedInRow[r][val] or self.usedInCol[c][val] or self.usedInBox[b][val] 

            or self.grid[r][c] is not None: return False

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = True

        return True



    def writeCell(self, pos: tuple, val: int) -> bool:

        if val is None: return True  # Ignore writing none

        r, c = pos

        b = 3*(r//3) + c//3



        if not self._canWrite(r, c, b, val): return False

        # noinspection PyTypeChecker

        self.grid[r][c] = val

        self.emptyCells.remove(pos)

        return True



    def eraseCell(self, pos: tuple):

        r, c = pos

        b = 3*(r//3) + c//3



        val = self.grid[r][c]  # Get old value for reference

        if val is None: return  # If there wasn't anything in the first place

        self.grid[r][c] = None  # Actually erase

        self.emptyCells.append(pos)

        # noinspection PyTypeChecker

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False



    def solve(self) -> bool:

        printQueue = queue.PriorityQueue()



        def printer():

            while True:

                priority, item = printQueue.get()

                if item is StopAsyncIteration:

                    break

                print(item)

                print('x1b[13A', end='')

                printQueue.task_done()



        printThread = threading.Thread(target=printer, name="PrintThread")

        printThread.start()



        def _solve() -> bool:

            if not self.emptyCells: return True

            pos = self.emptyCells[0]



            for n in range(1, 10):

                if self.writeCell(pos, n):

                    if _solve(): return True

                    printQueue.put((1, self))

                    self.eraseCell(pos)

            else:

                return False



        success = _solve()

        printQueue.put((0, StopAsyncIteration))

        printThread.join()

        return success

Using a set

The only things that change are the following.

_solve() internal method:

    def _solve():

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                printQueue.put((1, self))

                self.eraseCell(pos)

        else:

            self.emptyCells.add(pos)

            return False

eraseCell and writeCell:

def writeCell(self, pos: tuple, val: int):

    if val is None: return True  # Ignore writing none

    r, c = pos

    b = 3*(r//3) + c//3



    if not self._canWrite(r, c, b, val): return False

    # noinspection PyTypeChecker

    self.grid[r][c] = val

    self.emptyCells.discard(pos)

    return True



def eraseCell(self, pos: tuple):

    r, c = pos

    b = 3*(r//3) + c//3



    val = self.grid[r][c]  # Get old value for reference

    if val is None: return  # If there wasn't anything in the first place

    self.grid[r][c] = None  # Actually erase

    self.emptyCells.add(pos)

    # noinspection PyTypeChecker

    self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False

Note

An alternative maybe would be to save pos = self.emptyCells.pop() and add it straight back (self.emptyCells.add(pos)), and then proceed with solve_. Then I could change discard to remove.

Usage example

sudoku = Sudoku()

sudoku.solve()

print(sudoku)

Profiling results

Update — `pos` counts

I added an array to keep track of how many times each pos was selected to be explored, and I printed it at the end of the solve. Here are the results.

List version:

[[   1    1    1    1    1    1    1    1    1]

 [   1    1    1    1  145  671 7169 9869 7606]

 [ 774 1115  645    7 2066 1240  451  463  554]

 [ 511  466  504  424  554  422  358  502  566]

 [ 545  224  507  539  578  404  487  357  574]

 [ 421  421  452  436  465  406  326  621  483]

 [ 436  519  284  462  545  534  410  563  730]

 [ 433  561  311  397  365  254  308  708  384]

 [ 445  559  558  413  513  572  564  511  686]]

Set version (as far as I can tell, results are consistent):

[[ 19895      1  91239  66620      1      1  76794      1      1]

 [     1      1  77617  43061  95435      1      1  49617  77772]

 [ 54441      1      1  96081  79660      1      1      1  89575]

 [101233      1      1  85645  55172      1      1      1  54613]

 [     1      1  71512  61978   5531      1  76794  82692  74311]

 [ 96268      1  57136  92052      1      1  86104  91911      1]

 [     1  98119  95706  10680      1  97232  67210      1  39094]

 [ 53447      1      1      1  89846      1      1  88072      1]

 [ 74468  19103      1  39934  75698      1      1  90778  53260]]

edited Jan 12 at 13:35

asked Jan 2 at 16:09

Anakhand

426314

Nice code! Have you tried profiling, e.g. cProfile, to see where the performance boost takes place?

– Dascienz
Jan 4 at 17:25

@Dascienz Thanks! I've tried, but found no significant results. See the newly added last paragraph.

– Anakhand
Jan 4 at 19:11

Perhaps this answer would help you understand what's going on then? stackoverflow.com/questions/8929284/…

– Dascienz
Jan 4 at 19:20

@Dascienz Good resource! But the problem is exactly that. It's not the set that's faster than the list. It's the other way around, and not by little! Which is surprising... I guess it doesn't have to do with the data structure itself but rather the way set.pop() retrieves items (arbitrary instead of consistent).

– Anakhand
Jan 4 at 19:23

2

Did you count how may times the list vs the set function runs on the same sudoku? I suspect that the randomized version keeps adding and popping the same element several times. And therefore trying several paths more than once.

– swenzel
Jan 5 at 14:51

|
show 5 more comments

I've made a simple Sudoku class (in Python). Its main attributes are the following:

def __init__(self, it: iter = None):

    self.grid = [[None for _ in range(9)] for _ in range(9)]

    self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

    self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

    self.usedInBox = [[False for _ in range(10)] for _ in range(9)]



def writeCell(self, pos: tuple, val: int) -> bool: [...]



def eraseCell(self, pos: tuple): [...]

def __init__(self, it: iter = None):

    [...]

    self.emptyCells = [(i, j) for i in range(9) for j in range(9)]

I modified the writeCell and eraseCell methods to mantain the correct record. At this point the solver looks more or less like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells[0]



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: return False



    success = _solve()

    [...]

    return success

self.emptyCells = set((i, j) for i in range(9) for j in range(9))

and the methods that referred to it (for example I used .discard() instead of .remove() in writeCell). The solver method now looked like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()  # Get arbitrary position



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: 

            self.emptyCells.add(pos)  # Backtrack

            return False



    success = _solve()

    [...]

    return success

Full code

Using a list

import queue

import threading

import colorama



colorama.init()  # For ANSI escape codes





class Sudoku:

    @staticmethod

    def isSudoku(S: iter):

        return len(S) == 9 

               and all(len(row) == 9 for row in S) 

               and all(num in range(1, 10) for row in S for num in row)



    def __init__(self, it: iter = None):

        self.grid = [[None for _ in range(9)] for _ in range(9)]

        self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

        self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

        self.usedInBox = [[False for _ in range(10)] for _ in range(9)]

        self.emptyCells = [(i, j) for i in range(9) for j in range(9)]



        if it is not None:

            for i in range(9):

                for j, num in zip(range(9), it):

                    if num is not None:

                        if not self.writeCell((i, j), num): raise ValueError("Invalid Sudoku")



    def __iter__(self):

        return iter(self.grid)



    def __len__(self):

        return 9



    def __str__(self):

        def numsToStr(row):

            for num in row: yield num if num is not None else ' '



        def rowsBlock(i):

            yield ''

            for row in self.grid[3 * i:3 * (i + 1)]:

                yield '|{} {} {}|{} {} {}|{} {} {}|'.format(*numsToStr(row))

            yield ''



        def blocks():

            yield ''

            for i in range(3): yield 'n'.join(rowsBlock(i))

            yield ''



        return ('-' * 19).join(blocks())



    def __getitem__(self, key: tuple):

        if not len(key) == 2: raise TypeError("Two indices needed")

        r, c = key

        return self.grid[r][c]



    def __setitem__(self, key: tuple, value: int):

        if not len(key) == 2: raise TypeError("Two indices needed")

        self.eraseCell(key)

        self.writeCell(key, value)



    def _canWrite(self, r: int, c: int, b: int, val: int) -> bool:

        if self.usedInRow[r][val] or self.usedInCol[c][val] or self.usedInBox[b][val] 

            or self.grid[r][c] is not None: return False

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = True

        return True



    def writeCell(self, pos: tuple, val: int) -> bool:

        if val is None: return True  # Ignore writing none

        r, c = pos

        b = 3*(r//3) + c//3



        if not self._canWrite(r, c, b, val): return False

        # noinspection PyTypeChecker

        self.grid[r][c] = val

        self.emptyCells.remove(pos)

        return True



    def eraseCell(self, pos: tuple):

        r, c = pos

        b = 3*(r//3) + c//3



        val = self.grid[r][c]  # Get old value for reference

        if val is None: return  # If there wasn't anything in the first place

        self.grid[r][c] = None  # Actually erase

        self.emptyCells.append(pos)

        # noinspection PyTypeChecker

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False



    def solve(self) -> bool:

        printQueue = queue.PriorityQueue()



        def printer():

            while True:

                priority, item = printQueue.get()

                if item is StopAsyncIteration:

                    break

                print(item)

                print('x1b[13A', end='')

                printQueue.task_done()



        printThread = threading.Thread(target=printer, name="PrintThread")

        printThread.start()



        def _solve() -> bool:

            if not self.emptyCells: return True

            pos = self.emptyCells[0]



            for n in range(1, 10):

                if self.writeCell(pos, n):

                    if _solve(): return True

                    printQueue.put((1, self))

                    self.eraseCell(pos)

            else:

                return False



        success = _solve()

        printQueue.put((0, StopAsyncIteration))

        printThread.join()

        return success

Using a set

The only things that change are the following.

_solve() internal method:

    def _solve():

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                printQueue.put((1, self))

                self.eraseCell(pos)

        else:

            self.emptyCells.add(pos)

            return False

eraseCell and writeCell:

def writeCell(self, pos: tuple, val: int):

    if val is None: return True  # Ignore writing none

    r, c = pos

    b = 3*(r//3) + c//3



    if not self._canWrite(r, c, b, val): return False

    # noinspection PyTypeChecker

    self.grid[r][c] = val

    self.emptyCells.discard(pos)

    return True



def eraseCell(self, pos: tuple):

    r, c = pos

    b = 3*(r//3) + c//3



    val = self.grid[r][c]  # Get old value for reference

    if val is None: return  # If there wasn't anything in the first place

    self.grid[r][c] = None  # Actually erase

    self.emptyCells.add(pos)

    # noinspection PyTypeChecker

    self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False

Note

An alternative maybe would be to save pos = self.emptyCells.pop() and add it straight back (self.emptyCells.add(pos)), and then proceed with solve_. Then I could change discard to remove.

Usage example

sudoku = Sudoku()

sudoku.solve()

print(sudoku)

Profiling results

Update — `pos` counts

I added an array to keep track of how many times each pos was selected to be explored, and I printed it at the end of the solve. Here are the results.

List version:

[[   1    1    1    1    1    1    1    1    1]

 [   1    1    1    1  145  671 7169 9869 7606]

 [ 774 1115  645    7 2066 1240  451  463  554]

 [ 511  466  504  424  554  422  358  502  566]

 [ 545  224  507  539  578  404  487  357  574]

 [ 421  421  452  436  465  406  326  621  483]

 [ 436  519  284  462  545  534  410  563  730]

 [ 433  561  311  397  365  254  308  708  384]

 [ 445  559  558  413  513  572  564  511  686]]

Set version (as far as I can tell, results are consistent):

[[ 19895      1  91239  66620      1      1  76794      1      1]

 [     1      1  77617  43061  95435      1      1  49617  77772]

 [ 54441      1      1  96081  79660      1      1      1  89575]

 [101233      1      1  85645  55172      1      1      1  54613]

 [     1      1  71512  61978   5531      1  76794  82692  74311]

 [ 96268      1  57136  92052      1      1  86104  91911      1]

 [     1  98119  95706  10680      1  97232  67210      1  39094]

 [ 53447      1      1      1  89846      1      1  88072      1]

 [ 74468  19103      1  39934  75698      1      1  90778  53260]]

edited Jan 12 at 13:35

asked Jan 2 at 16:09

Anakhand

426314

I've made a simple Sudoku class (in Python). Its main attributes are the following:

def __init__(self, it: iter = None):

    self.grid = [[None for _ in range(9)] for _ in range(9)]

    self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

    self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

    self.usedInBox = [[False for _ in range(10)] for _ in range(9)]



def writeCell(self, pos: tuple, val: int) -> bool: [...]



def eraseCell(self, pos: tuple): [...]

def __init__(self, it: iter = None):

    [...]

    self.emptyCells = [(i, j) for i in range(9) for j in range(9)]

I modified the writeCell and eraseCell methods to mantain the correct record. At this point the solver looks more or less like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells[0]



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: return False



    success = _solve()

    [...]

    return success

self.emptyCells = set((i, j) for i in range(9) for j in range(9))

and the methods that referred to it (for example I used .discard() instead of .remove() in writeCell). The solver method now looked like this:

def solve(self) -> bool:

    [...]



    def _solve() -> bool:

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()  # Get arbitrary position



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                self.eraseCell(pos)

        else: 

            self.emptyCells.add(pos)  # Backtrack

            return False



    success = _solve()

    [...]

    return success

Full code

Using a list

import queue

import threading

import colorama



colorama.init()  # For ANSI escape codes





class Sudoku:

    @staticmethod

    def isSudoku(S: iter):

        return len(S) == 9 

               and all(len(row) == 9 for row in S) 

               and all(num in range(1, 10) for row in S for num in row)



    def __init__(self, it: iter = None):

        self.grid = [[None for _ in range(9)] for _ in range(9)]

        self.usedInRow = [[False for _ in range(10)] for _ in range(9)]

        self.usedInCol = [[False for _ in range(10)] for _ in range(9)]

        self.usedInBox = [[False for _ in range(10)] for _ in range(9)]

        self.emptyCells = [(i, j) for i in range(9) for j in range(9)]



        if it is not None:

            for i in range(9):

                for j, num in zip(range(9), it):

                    if num is not None:

                        if not self.writeCell((i, j), num): raise ValueError("Invalid Sudoku")



    def __iter__(self):

        return iter(self.grid)



    def __len__(self):

        return 9



    def __str__(self):

        def numsToStr(row):

            for num in row: yield num if num is not None else ' '



        def rowsBlock(i):

            yield ''

            for row in self.grid[3 * i:3 * (i + 1)]:

                yield '|{} {} {}|{} {} {}|{} {} {}|'.format(*numsToStr(row))

            yield ''



        def blocks():

            yield ''

            for i in range(3): yield 'n'.join(rowsBlock(i))

            yield ''



        return ('-' * 19).join(blocks())



    def __getitem__(self, key: tuple):

        if not len(key) == 2: raise TypeError("Two indices needed")

        r, c = key

        return self.grid[r][c]



    def __setitem__(self, key: tuple, value: int):

        if not len(key) == 2: raise TypeError("Two indices needed")

        self.eraseCell(key)

        self.writeCell(key, value)



    def _canWrite(self, r: int, c: int, b: int, val: int) -> bool:

        if self.usedInRow[r][val] or self.usedInCol[c][val] or self.usedInBox[b][val] 

            or self.grid[r][c] is not None: return False

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = True

        return True



    def writeCell(self, pos: tuple, val: int) -> bool:

        if val is None: return True  # Ignore writing none

        r, c = pos

        b = 3*(r//3) + c//3



        if not self._canWrite(r, c, b, val): return False

        # noinspection PyTypeChecker

        self.grid[r][c] = val

        self.emptyCells.remove(pos)

        return True



    def eraseCell(self, pos: tuple):

        r, c = pos

        b = 3*(r//3) + c//3



        val = self.grid[r][c]  # Get old value for reference

        if val is None: return  # If there wasn't anything in the first place

        self.grid[r][c] = None  # Actually erase

        self.emptyCells.append(pos)

        # noinspection PyTypeChecker

        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False



    def solve(self) -> bool:

        printQueue = queue.PriorityQueue()



        def printer():

            while True:

                priority, item = printQueue.get()

                if item is StopAsyncIteration:

                    break

                print(item)

                print('x1b[13A', end='')

                printQueue.task_done()



        printThread = threading.Thread(target=printer, name="PrintThread")

        printThread.start()



        def _solve() -> bool:

            if not self.emptyCells: return True

            pos = self.emptyCells[0]



            for n in range(1, 10):

                if self.writeCell(pos, n):

                    if _solve(): return True

                    printQueue.put((1, self))

                    self.eraseCell(pos)

            else:

                return False



        success = _solve()

        printQueue.put((0, StopAsyncIteration))

        printThread.join()

        return success

Using a set

The only things that change are the following.

_solve() internal method:

    def _solve():

        if not self.emptyCells: return True

        pos = self.emptyCells.pop()



        for n in range(1, 10):

            if self.writeCell(pos, n):

                if _solve(): return True

                printQueue.put((1, self))

                self.eraseCell(pos)

        else:

            self.emptyCells.add(pos)

            return False

eraseCell and writeCell:

def writeCell(self, pos: tuple, val: int):

    if val is None: return True  # Ignore writing none

    r, c = pos

    b = 3*(r//3) + c//3



    if not self._canWrite(r, c, b, val): return False

    # noinspection PyTypeChecker

    self.grid[r][c] = val

    self.emptyCells.discard(pos)

    return True



def eraseCell(self, pos: tuple):

    r, c = pos

    b = 3*(r//3) + c//3



    val = self.grid[r][c]  # Get old value for reference

    if val is None: return  # If there wasn't anything in the first place

    self.grid[r][c] = None  # Actually erase

    self.emptyCells.add(pos)

    # noinspection PyTypeChecker

    self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False

Note

An alternative maybe would be to save pos = self.emptyCells.pop() and add it straight back (self.emptyCells.add(pos)), and then proceed with solve_. Then I could change discard to remove.

Usage example

sudoku = Sudoku()

sudoku.solve()

print(sudoku)

Profiling results

Update — `pos` counts

I added an array to keep track of how many times each pos was selected to be explored, and I printed it at the end of the solve. Here are the results.

List version:

[[   1    1    1    1    1    1    1    1    1]

 [   1    1    1    1  145  671 7169 9869 7606]

 [ 774 1115  645    7 2066 1240  451  463  554]

 [ 511  466  504  424  554  422  358  502  566]

 [ 545  224  507  539  578  404  487  357  574]

 [ 421  421  452  436  465  406  326  621  483]

 [ 436  519  284  462  545  534  410  563  730]

 [ 433  561  311  397  365  254  308  708  384]

 [ 445  559  558  413  513  572  564  511  686]]

Set version (as far as I can tell, results are consistent):

[[ 19895      1  91239  66620      1      1  76794      1      1]

 [     1      1  77617  43061  95435      1      1  49617  77772]

 [ 54441      1      1  96081  79660      1      1      1  89575]

 [101233      1      1  85645  55172      1      1      1  54613]

 [     1      1  71512  61978   5531      1  76794  82692  74311]

 [ 96268      1  57136  92052      1      1  86104  91911      1]

 [     1  98119  95706  10680      1  97232  67210      1  39094]

 [ 53447      1      1      1  89846      1      1  88072      1]

 [ 74468  19103      1  39934  75698      1      1  90778  53260]]

algorithm performance data-structures recursive-backtracking

edited Jan 12 at 13:35

asked Jan 2 at 16:09

Anakhand

426314

edited Jan 12 at 13:35

asked Jan 2 at 16:09

Anakhand

426314

edited Jan 12 at 13:35

asked Jan 2 at 16:09

Anakhand

426314

asked Jan 2 at 16:09

Anakhand

426314

asked Jan 2 at 16:09

Anakhand

426314

Nice code! Have you tried profiling, e.g. cProfile, to see where the performance boost takes place?

– Dascienz
Jan 4 at 17:25

@Dascienz Thanks! I've tried, but found no significant results. See the newly added last paragraph.

– Anakhand
Jan 4 at 19:11

Perhaps this answer would help you understand what's going on then? stackoverflow.com/questions/8929284/…

– Dascienz
Jan 4 at 19:20

@Dascienz Good resource! But the problem is exactly that. It's not the set that's faster than the list. It's the other way around, and not by little! Which is surprising... I guess it doesn't have to do with the data structure itself but rather the way set.pop() retrieves items (arbitrary instead of consistent).

– Anakhand
Jan 4 at 19:23

2

Did you count how may times the list vs the set function runs on the same sudoku? I suspect that the randomized version keeps adding and popping the same element several times. And therefore trying several paths more than once.

– swenzel
Jan 5 at 14:51

|
show 5 more comments

Nice code! Have you tried profiling, e.g. cProfile, to see where the performance boost takes place?

– Dascienz
Jan 4 at 17:25

@Dascienz Thanks! I've tried, but found no significant results. See the newly added last paragraph.

– Anakhand
Jan 4 at 19:11

Perhaps this answer would help you understand what's going on then? stackoverflow.com/questions/8929284/…

– Dascienz
Jan 4 at 19:20

@Dascienz Good resource! But the problem is exactly that. It's not the set that's faster than the list. It's the other way around, and not by little! Which is surprising... I guess it doesn't have to do with the data structure itself but rather the way set.pop() retrieves items (arbitrary instead of consistent).

– Anakhand
Jan 4 at 19:23

2

Did you count how may times the list vs the set function runs on the same sudoku? I suspect that the randomized version keeps adding and popping the same element several times. And therefore trying several paths more than once.

– swenzel
Jan 5 at 14:51

Nice code! Have you tried profiling, e.g. cProfile, to see where the performance boost takes place?

– Dascienz
Jan 4 at 17:25

@Dascienz Thanks! I've tried, but found no significant results. See the newly added last paragraph.

– Anakhand
Jan 4 at 19:11

Perhaps this answer would help you understand what's going on then? stackoverflow.com/questions/8929284/…

– Dascienz
Jan 4 at 19:20

@Dascienz Good resource! But the problem is exactly that. It's not the set that's faster than the list. It's the other way around, and not by little! Which is surprising... I guess it doesn't have to do with the data structure itself but rather the way set.pop() retrieves items (arbitrary instead of consistent).

– Anakhand
Jan 4 at 19:23

Did you count how may times the list vs the set function runs on the same sudoku? I suspect that the randomized version keeps adding and popping the same element several times. And therefore trying several paths more than once.

– swenzel
Jan 5 at 14:51

|
show 5 more comments

0

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54009544%2fwhy-does-arbitrary-selection-of-next-cell-affect-performance-in-backtracking-sud%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

0

active

oldest

votes

0

active

oldest

votes

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

Category

Random preview

Why does arbitrary selection of next cell affect performance in backtracking Sudoku solver?

Full code

Using a list

Using a set

Note

Usage example

Profiling results

Update — pos counts

Full code

Using a list

Using a set

Note

Usage example

Profiling results

Update — pos counts

Full code

Using a list

Using a set

Note

Usage example

Profiling results

Update — pos counts

Full code

Using a list

Using a set

Note

Usage example

Profiling results

Update — pos counts

0

Your Answer

Sign up or log in

Post as a guest

Post as a guest

0

0

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

MongoDB - Not Authorized To Execute Command

in spring boot 2.1 many test slices are not allowed anymore due to multiple @BootstrapWith

How to fix TextFormField cause rebuild widget in Flutter

Update — `pos` counts

Update — `pos` counts

Update — `pos` counts

Update — `pos` counts