Genetic Algorithm and Sudoku
I've recently been interested in Sudoku puzzles and I'm trying to figure out a way to generate the puzzles. At first I thought I could use a genetic algorithm to generate a puzzle through generation iterations as I heard a few people have already tried. Here lies the first stumbling block.
I have a decent fitness method and I can choose two puzzles from the population (randomly generated grid filled with numbers) but I cannot figure out a way to cross breed these two grids. Since there doesn't seem to be a way that would help determine a possible solution except by randomly copying numbers from one or the other.
Any suggestions?
BTW, does any one knows what determines the difficulty of a sudoku problem? Is it just the number of missing spaces.
I actually tried doing this about 2 months ago. As you mentioned, there doesn't seem to be a good crossover operator available. In my opinion even if you found a crossover operator that produced valid solutions, it would do little good anyway as it would likely just be effectively macro-mutation (and macro-mutation is highly unlikely to increase fitness).
I chose to just use mutation instead and focused on figuring out some good mutation operators instead. I implemented a substitution mutation and a swap mutation (swapped two random boxes in the grid).
Unfortunately my conclusion was that GA's are not well suited to this problem. The reason is that Suduko requires a perfect solution, wheras the evolutionary process is best suited for problems where near-perfect solutions are acceptable. In Sudoku near-perfect solutions are just wrong solutions.
Using a GA to solve suduko you can easily reach a local maximum where you have just two numbers in the wrong order, but it would take a long series of downhill mutations to reach the global optimum.
Unless you hybridise the algorithm with some sort of domain knowledge (which is kind of cheating) then you may have to leave your machine running for days to find the solution.
I chose to cheat instead. I found that suduko can actually be solved using a sequence of logical steps. In answer to your question about how the difficulty of Suduko puzzles is based, no it is not soley based on the number of missing spaces. There is a sequence of steps you can use to solve a suduko puzzle and more difficult puzzles will require more steps.
The first step I used was to store a list of possible values for each box in the grid. Then the program removes values from this list that cannot go in the box (ie if there is a 5 in the same row, or column as the box then 5 cannot go into the box, so remove 5 is removed from the list of possible values for that box).
After ruling out values for all the boxes in the grid, the next step is to find boxes with only one possible value left. You then obviously assign that value to the box, and then because the grid has changed, you redo step 1 above.
These steps alone can complete the easier Suduko puzzles, but with moderately hard puzzles you reach a stage where there are boxes not yet filled in, but they all have more than one possible value.
The next step is to look at each 3x3 subgrid seperately. You know the rule that a subgrid must contain the numbers 1 to 9. Take the numbers 1 to 9 in turn and check to see if they can only go in one box in the subgrid. If so then fill in that box, and redo from the first step.
The above steps will solve most moderate puzzles, and some hard. But some puzzles are even harder and I got stuck on that stage. My brain wasn't good enough to figure out the next logical step (I have no idea if there are more barriers after that).
Instead I just brute forced the rest, by treating it like a combination lock, and exiting early if a combination broke the suduko constraints. The above steps reduced the combinations enough so that it could be solved instantaneously.
I chose to just use mutation instead and focused on figuring out some good mutation operators instead. I implemented a substitution mutation and a swap mutation (swapped two random boxes in the grid).
Unfortunately my conclusion was that GA's are not well suited to this problem. The reason is that Suduko requires a perfect solution, wheras the evolutionary process is best suited for problems where near-perfect solutions are acceptable. In Sudoku near-perfect solutions are just wrong solutions.
Using a GA to solve suduko you can easily reach a local maximum where you have just two numbers in the wrong order, but it would take a long series of downhill mutations to reach the global optimum.
Unless you hybridise the algorithm with some sort of domain knowledge (which is kind of cheating) then you may have to leave your machine running for days to find the solution.
I chose to cheat instead. I found that suduko can actually be solved using a sequence of logical steps. In answer to your question about how the difficulty of Suduko puzzles is based, no it is not soley based on the number of missing spaces. There is a sequence of steps you can use to solve a suduko puzzle and more difficult puzzles will require more steps.
The first step I used was to store a list of possible values for each box in the grid. Then the program removes values from this list that cannot go in the box (ie if there is a 5 in the same row, or column as the box then 5 cannot go into the box, so remove 5 is removed from the list of possible values for that box).
After ruling out values for all the boxes in the grid, the next step is to find boxes with only one possible value left. You then obviously assign that value to the box, and then because the grid has changed, you redo step 1 above.
These steps alone can complete the easier Suduko puzzles, but with moderately hard puzzles you reach a stage where there are boxes not yet filled in, but they all have more than one possible value.
The next step is to look at each 3x3 subgrid seperately. You know the rule that a subgrid must contain the numbers 1 to 9. Take the numbers 1 to 9 in turn and check to see if they can only go in one box in the subgrid. If so then fill in that box, and redo from the first step.
The above steps will solve most moderate puzzles, and some hard. But some puzzles are even harder and I got stuck on that stage. My brain wasn't good enough to figure out the next logical step (I have no idea if there are more barriers after that).
Instead I just brute forced the rest, by treating it like a combination lock, and exiting early if a combination broke the suduko constraints. The above steps reduced the combinations enough so that it could be solved instantaneously.
I fully agree with RunningInt. Sudoku is not a puzzle that will be solved very well at all with a genetic algorithm. The fact that there are strategies and various logical steps necessary says it all. The best way to implement any AI to solve Sudoku is to program it to use the same techniques a human player would use in solving the puzzle. You can easily find these anywhere and in fact using them creates for a very fast puzzle solver ( as opposed to brute force... ).
Sudoku is a prime example of a class of problems known as Constraint Satisfaction Problems. The formal definition of a CSP is as follows (taken from my AI professors website):
A CSP is a triplet {V, D, C}, where V is a finite number of variables, V={V1, V2, ..., Vn}.
Each variable may be assigned a value from a domain D.
Each member of C is a pair: the first element of the pair is a set of variables, and the second element is the set of legal values for those sets. (Note that C is usually represented with a function, rather than explicitly).
The method to solve such a problem is depth first search with constraint propogation (detailed in the website I cited earlier). The basic idea is for each variable, determine the set of all possible values for it. You perform a depth first search, where every time you try setting a variable, you check that variable with all of the constraints, eliminating illegal values from other variables. You then propogate these changes recursively, doing a similar check for each variable possible set that you changed. If you ever prune a set to the empty set, then you end that branch of the search and backtrack.
In practice, the constraint propogation "pre-processing" will sometimes solve the problem with no search at all. If it doesn't, it usually doesn't take many iterations of the search to find a solution.
A CSP is a triplet {V, D, C}, where V is a finite number of variables, V={V1, V2, ..., Vn}.
Each variable may be assigned a value from a domain D.
Each member of C is a pair: the first element of the pair is a set of variables, and the second element is the set of legal values for those sets. (Note that C is usually represented with a function, rather than explicitly).
The method to solve such a problem is depth first search with constraint propogation (detailed in the website I cited earlier). The basic idea is for each variable, determine the set of all possible values for it. You perform a depth first search, where every time you try setting a variable, you check that variable with all of the constraints, eliminating illegal values from other variables. You then propogate these changes recursively, doing a similar check for each variable possible set that you changed. If you ever prune a set to the empty set, then you end that branch of the search and backtrack.
In practice, the constraint propogation "pre-processing" will sometimes solve the problem with no search at all. If it doesn't, it usually doesn't take many iterations of the search to find a solution.
Have the last 3 posters all missed the fact that the original poster asked to "generate the puzzles" which is not the same as "solving the puzzles"?
LOL yes. whoops. I guess generating one follows a similar method though. First you have to effectively solve one, and then get rid of some of the answers.
Quote: Original post by Kylotan
Have the last 3 posters all missed the fact that the original poster asked to "generate the puzzles" which is not the same as "solving the puzzles"?
Quote:
Since there doesn't seem to be a way that would help determine a possible solution except by randomly copying numbers from one or the other.
His post was kind of confusing. He said he wanted to make puzzles, but he also said he was having difficulty getting the computer to solve them. I was answering the latter question.
Additionally, as RunningInt touched on, making a puzzle is essentially solving a puzzle with no constraints, and then replacing some of the squares with blanks. By randomizing which node you expand in the depth first search part of constraint propogation, you can create random solutions. The problem is then reduced to finding a strategy of covering squares given a certain difficulty, checking to make sure the end result is solvable.
Why do you care about cross-over? Is random 'mutation' not enough?
Will
Will
------------------http://www.nentari.com
Quote: Original post by RPGeezus
Why do you care about cross-over? Is random 'mutation' not enough?
Will
Well, some believe that crossover is the only thing needed in GAs, while others believe that the whole crossover concept is worthless.
In this case, due to the high level of dependency among the values in the candidate solution, crossover may not be exactly possible.
As for generating a problem, I think using a GA may not be the best idea for directly generating a problem. However, you could use a GA to generate an end result, then use a reverse solver to start plucking out values.
I was planning on making a little Sudoku game, and the method for generating the puzzles I thought up is as follows.
Basically the there are two phases:
Phase one: Fill the board.
To fill the board you use a simple iterative search. For each box you determine the valid locations for the current number and then place it in a random spot. Then move to the next box. This repeated until the current number is placed in all boxes, then you move to the next number and repeat. Eventually you will have a completed Sudoku.
Phase two: determine values to show:
Once you have a completed board you create a play board by determining the number values to show based on the difficulty level and then show that many random values and you are done.
Basically the there are two phases:
Phase one: Fill the board.
To fill the board you use a simple iterative search. For each box you determine the valid locations for the current number and then place it in a random spot. Then move to the next box. This repeated until the current number is placed in all boxes, then you move to the next number and repeat. Eventually you will have a completed Sudoku.
Phase two: determine values to show:
Once you have a completed board you create a play board by determining the number values to show based on the difficulty level and then show that many random values and you are done.
Writing Blog: The Aspiring Writer
Novels:
Legacy - Black Prince Saga Book One - By Alexander Ballard (Free this week)
This topic is closed to new replies.
Advertisement
Popular Topics
Advertisement