
Genetic Algorithm and Sudoku

Started by November 04, 2005 11:22 AM
63 comments, last by Rockoon1 19 years, 2 months ago
Jotaf: obviously, when taken with regard to a specific population, a schema is just shorthand notation for that set of individuals, which, as you say, has a particular probability distribution over allele values associated with it. However, that wasn't the point I was trying to get at by describing schemata... and you have touched on the heart of the matter I was trying to describe. The more general the schema, the larger the proportion of the population it represents and, correspondingly, the (typically) larger the spread in 'fitness' values within the subpopulation the schema represents. Furthermore, if you ask what the expected fitness of a member generated according to that subpopulation is, then yes, it's the fitness of the expected member selected according to the aforementioned probability distribution.

If I write '#0##11#' though, without linking it to a specific population, then I'm talking about the full set of chromosomes that match that schema, and my discussion is with reference to that theoretical population. Yes, that population has uniform statistics, but they're actually not important. What is important is that, under some very mild assumptions, it's fairly easy to show that '#0##11#' represents a more diverse population than, say, '100#11#', with a correspondingly broader density function over fitness values.
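The intuition above can be made concrete with a short sketch. The function and variable names here are my own, but the convention is the standard one: '#' is the "don't care" wildcard, so a schema with more wildcards is more general and represents a larger set of chromosomes.

```python
# A minimal sketch of schema matching on binary string chromosomes.
# '#' is the wildcard ("don't care") symbol; names are illustrative.

def matches(schema, chromosome):
    """True if the chromosome is an instance of the schema."""
    return all(s == '#' or s == c for s, c in zip(schema, chromosome))

def order(schema):
    """Number of fixed (non-wildcard) positions; lower order = more general."""
    return sum(1 for s in schema if s != '#')

def instances(schema):
    """Size of the set of chromosomes the schema represents: 2^(wildcards)."""
    return 2 ** schema.count('#')

# The more general schema represents a larger (and thus more diverse) set:
assert order('#0##11#') < order('100#11#')
assert instances('#0##11#') > instances('100#11#')
```

For these two schemata, '#0##11#' covers 2^4 = 16 chromosomes while '100#11#' covers only 2^2 = 4, which is the sense in which the more general schema has a broader spread of fitness values underneath it.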

The key difference between talking about a general schema and one related to a specific population is actually the distribution of the schemata over subsets of the population represented by the most general schema. For two parents, there is only one schema that will be processed, so the GA can make a decision at that time, subject to evaluation of the offspring, about the quality of that schema. If you start using more parents, that decision is far weaker, and it will take more time for the population and algorithm to resolve whether the schemata represented by the parent set are of good quality (some might be good while others are bad, causing some propagation of mixed good and bad information into the population).

I hope this makes my point clearer.

Cheers,

Timkin
Reading through the posts, there seems to be one piece missing. Timkin alludes to it in his schemata explanation, but perhaps it is beneficial to state it explicitly.

Cross-over works in domains where partial solutions can exist and contribute to the overall solution independently of other partial solutions. This may seem obvious, but I believe it warrants mentioning. Going back to the bimodal example, let's suppose that solution A and solution B are near but not at the top of their maxima. It is possible that each represents an exceptional, but not optimal, combination of partial solutions. By using cross-over, you can recombine the contextually best portions of solutions A and B to give two solutions which are effectively optimal. Mutation can also achieve this, but not as efficiently.
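As an illustration of that recombination, here is a sketch of standard one-point crossover on string chromosomes. The parents `a` and `b` are a contrived, hypothetical case where each carries a different good partial solution; if the cut point happens to fall between the two good halves, one child inherits both.

```python
import random

def one_point_crossover(a, b, rng=random):
    """Recombine two equal-length parents at a random cut point,
    yielding two offspring."""
    point = rng.randrange(1, len(a))  # cut strictly inside the string
    return a[:point] + b[point:], b[:point] + a[point:]

# Hypothetical example: parent A has a good left half, parent B a good
# right half. A cut at position 4 would produce the child "11111111".
a = "11110000"
b = "00001111"
child1, child2 = one_point_crossover(a, b)
```

Mutation could also assemble the combined solution, but only by flipping the right bits one at a time, which is the efficiency gap the paragraph above describes.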

Regarding small vs. large mutations, I've always found it beneficial to have a range of mutation operators spanning the minimally to the maximally destructive. Minimally destructive mutations correspond to the point mutations which lead to stochastic hill-climbing. Maximally destructive mutations, I've found, help to keep the population diverse and introduce new information. Granted, late in a run, a maximally destructive mutation operator will create a poor solution compared to the remainder of the nearly converged population.
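A sketch of what such a spectrum might look like on binary string chromosomes; the operator names and granularity are my own choices, not anything prescribed by the thread.

```python
import random

def point_mutation(chromosome, rng=random):
    """Flip a single bit: the minimally destructive operator
    (stochastic hill-climbing when paired with selection)."""
    i = rng.randrange(len(chromosome))
    flipped = '1' if chromosome[i] == '0' else '0'
    return chromosome[:i] + flipped + chromosome[i + 1:]

def segment_scramble(chromosome, rng=random):
    """Randomize a contiguous segment: moderately destructive."""
    i = rng.randrange(len(chromosome))
    j = rng.randrange(i + 1, len(chromosome) + 1)
    segment = ''.join(rng.choice('01') for _ in range(j - i))
    return chromosome[:i] + segment + chromosome[j:]

def full_randomize(chromosome, rng=random):
    """Replace the whole individual: maximally destructive;
    reintroduces diversity into a nearly converged population."""
    return ''.join(rng.choice('01') for _ in chromosome)
```

In practice you would apply the destructive end of the spectrum at a much lower rate, precisely because, late in a run, its output will usually be far worse than the converged population it lands in.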

Finally, there are indeed examples of crossover with more than 2 parents. I'll have to search a bit to find them...
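One published family of multi-parent operators is scanning crossover (Eiben et al.); a minimal sketch of the uniform variant, assuming fixed-length string chromosomes, with illustrative names:

```python
import random

def uniform_scanning_crossover(parents, rng=random):
    """Build one child by drawing each gene from the corresponding
    position of a randomly chosen parent. Works for any number of
    equal-length parents."""
    length = len(parents[0])
    return ''.join(rng.choice(parents)[i] for i in range(length))
```

Note how this connects to the multi-parent caveat above: a child built from three or more parents carries mixed schemata, so offspring evaluation gives weaker evidence about any single parent's building blocks.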

Just my $0.02.

-Kirk

Quote:
Original post by Jotaf
To Rockoon, The Anonymous Poster (you should get an account BTW :)


Done.

Quote:
Original post by Jotaf
I think another reason why most people implement generations is because doing all the sorting and fitness evaluation just to find one bad individual is a kind of a waste of resources :P A programmer thinks "why not do it to 99 more individuals for free?"! You already got them sorted out.


Even in those cases where re-ranking is necessary, the arbitrary concept of 'generations' is probably not the best way to minimize the cost of that operation. :)

Also, many implementations (problem-specific) do not require a sorted population at all; they just require knowledge of which member is worst.
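A sketch of that steady-state style of step, to make the cost argument concrete: finding the worst member is a single O(n) scan with `min()`, rather than the O(n log n) full sort a generational re-ranking implies. The fitness function and offspring operator here are placeholders.

```python
import random

def steady_state_step(population, fitness, make_offspring, rng=random):
    """One steady-state GA step: locate the worst member with a linear
    scan and replace it if the new offspring is better. No sorting."""
    worst_i = min(range(len(population)), key=lambda i: fitness(population[i]))
    child = make_offspring(population, rng)
    if fitness(child) > fitness(population[worst_i]):
        population[worst_i] = child
    return population
```

This is one way to sidestep the "generations" bookkeeping entirely: each step touches one individual, and the population is never globally re-ranked.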

This topic is closed to new replies.
