
Genetic Algorithm and Sudoku

Started by November 04, 2005 11:22 AM
63 comments, last by Rockoon1 18 years, 11 months ago
Quote: Original post by Timkin
Quote: Original post by RPGeezus
For cross-over to work we must satisfy some not-so-trivial criteria. I thought we were all in agreement in this regard, at least semantically.

No, I don't see agreement on this point. From my experience, crossover will always be effective on problems that are solvable by a directed random walk, and crossover will perform better than said random walk.



You had said earlier that:

"your attributes should be as orthogonal as possible. This has implications on how effectively you can search the objective function surface and how effective your operators are in the GA."

and proceeded to highlight proper schema encoding.

I consider this non-trivial.


Quote:
Quote: Original post by Timkin
Because crossover between two parents preserves as many schemata presently in the population as possible, while possibly forming new schemata which can be evaluated during selection.


That doesn't make any sense to me, Tim. Why would 3 parents be less effective than 2? If I cut a pie in 2 or in 10, it's still all part of the same pie. The number '2' would be suspicious to someone looking for evidence of anthropomorphic biases...


Quote: Original post by Timkin
Okay, a brief introduction to Schema Theory...
.
.
.


Sorry, I can't say I fully understand your example. The masking you describe is only masking if we use binary states.

i.e.

cat dog    bear   rat
cat monkey owl    rat


The child could be:

cat dog/monkey bear/owl rat

but never

cat turkey rabbit rat

Doesn't this satisfy "GAs preserve good quality schema"? Adding a third parent wouldn't change that.


When, earlier, I said that the optimal number of parents might be related to dimensions in solution space I was considering the following:

a = f( x, y, z);

where x, y, and z are numbers between -1 and 1.

I want x, y, and z so that a = some number.

member 1 has the X part solved (magically), member 2 has the y part solved, and member 3 has the z part solved. Allowing 3 parents I could actually get the correct solution right away. Given two parents I would require at least two generations.
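A minimal sketch of what I mean (the function, target, and parent values are all made up for illustration, and I'm assuming a crossover that can take each gene from a different one of the three parents):

def f(x, y, z):
    return x + y + z              # stand-in objective; the real f could be anything

target = 0.5

parent1 = (0.25, -0.7,  0.9)      # x component assumed "magically" solved
parent2 = (-0.4,  0.5,  0.3)      # y component assumed solved
parent3 = ( 0.6, -0.5, -0.25)     # z component assumed solved

# If the 3-parent crossover happens to take gene i from parent i, the solved
# pieces combine in a single recombination:
child = (parent1[0], parent2[1], parent3[2])
print(f(*child))                  # 0.5 -- the target, in one step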


Quote: Original post by Timkin
Quote:
Will cross-over, without mutation, ever find a solution?

It will find the best solution available given the initial information contained in the population. See above.


I take it you mean 'no, unless.......'.


Quote: Original post by Timkin
Quote:
Will mutation, without cross-over, ever find a solution?

Possibly... and that is the best that can be said.


The same can be said of using mutation AND cross-over. I'll take your answer as 'regretfully yes'.

Quote: Original post by Timkin
Quote:
We're talking about a problem with more than one global maximum. In the hypothetical example there is more than one true global maximum.

If they are both (all) global maxima, then you cannot care which global maximum you attain.


I do care, because if we're talking about swapping components of solutions, and those components deal with solutions located at different global maxima, then they will be incompatible. By 'will' I mean 'will likely'.


Will
------------------http://www.nentari.com
Wow, I was actually thinking about making a Sudoku solver program today and was just about to start writing it. Just wanted to check the active topics real quick and saw other people making programs around it too.
-Chris
Sorry if I'm missing something (I did skip over a lot of what I thought was irrelevant to what I'm about to ask), but if you MAKE the puzzle by starting with a completed one and taking away some values, what prevents you from creating a puzzle with more than one solution? Surely there has to be a certain way you have to make them blank, right? I haven't tried this yet so I don't know if it's true; it just seems like a possibility to me.
-Chris
Quote: Original post by RPGeezus
You had said earlier that:

"your attributes should be as orthogonal as possible."


Again you're not taking that in the context of what I wrote. Any state space can be mapped into an orthogonal space (or locally orthogonal space) prior to encoding... and in these spaces, the rate of convergence of the GA will be higher (so crossover could be said to be more effective).


Quote:
Sorry, I can't say I fully understand your example. The masking you describe is only masking if we use binary states.


...a common misconception regarding GAs. Any string made up of characters of cardinality >2 can be mapped into a string with characters of cardinality 2. That is, I can trivially represent the gene 'cat' and its alleles ('tabby','moggy' and 'tom', for example ;) ) using the binary sequences '10','01' and '11' (or any other sequence I choose). Any set of tokens that you wish to use for your domain can be mapped into binary strings and thus I can still discuss the merits of the search algorithm applied to your example, but do so in the space of binary encoding strings.
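A tiny sketch of that mapping; the alleles are Timkin's example, but the code widths and specific bit assignments here are my own choices:

alleles = ['tabby', 'moggy', 'tom']
width = max(1, (len(alleles) - 1).bit_length())   # bits needed for a 1-1 code

encode = {a: format(i, '0{}b'.format(width)) for i, a in enumerate(alleles)}
decode = {bits: a for a, bits in encode.items()}

print(encode)        # {'tabby': '00', 'moggy': '01', 'tom': '10'}
print(decode['01'])  # 'moggy'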


Quote:
Doesn't this satisfy "GAs preserve good quality schema"? Adding a third parent wouldn't change that.


Yes, it would. Consider the schema you get if the third parent you select for the tuple is 'goat turkey fox elephant'. What are the set of offspring now? And what values of the objective function would be found in that set? The variance of the objective function over the offspring set is a measure of the quality of the schema represented in the parents. The larger the offspring set, the more likely a higher variance (and vice versa).

Quote: When, earlier, I said that the optimal number of parents might be related to dimensions in solution space I was considering the following:

a = f( x, y, z);

where x, y, and z are numbers between -1 and 1.

I want x, y, and z so that a = some number.

member 1 has the X part solved (magically), member 2 has the y part solved, and member 3 has the z part solved. Allowing 3 parents I could actually get the correct solution right away.

That's simply not true, since you can make no assumptions about whether the correct (solved) part of each parent chromosome will end up in the offspring. Furthermore, you're assuming that you are performing multi-site crossover, which is not a canonical operator.
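For reference, a rough sketch of the distinction, treating single-point crossover as the canonical operator and an independent per-gene choice as the 'multi-site' variant (the code and values are illustrative only, not taken from the thread):

import random

def single_point(p1, p2):
    # canonical one-point crossover: one cut, at most two parents contribute
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:]

def multi_site(parents):
    # per-gene ("multi-site") choice across any number of parents
    return [random.choice(genes) for genes in zip(*parents)]

p1, p2, p3 = [0.25, -0.7, 0.9], [-0.4, 0.5, 0.3], [0.6, -0.5, -0.25]
print(single_point(p1, p2))      # e.g. [0.25, 0.5, 0.3]
print(multi_site([p1, p2, p3]))  # e.g. [0.6, 0.5, 0.9]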

Quote: I take it you mean 'no, unless.......'.

No, that's not what I'm saying. What I said was that crossover will find the optimal solution represented by the initial population. To borrow from your example earlier... if your initial population didn't contain the allele 'hound' for the dog gene, but this was the value of the allele in the global maximum, then obviously crossover cannot find that global maximum. However, it will allow you to find the string of highest fitness that IS contained within the set defined by the combinations of alleles in the initial population. Make sense?

Quote:
The same can be said of using mutation AND cross-over. I'll take your answer as 'regretfully yes'.


You're doing it again! No, that's NOT what I said. In the absence of selection pressure, neither method is anything better than randomly generating guesses at the answer. With the above caveat regarding crossover, they both are unlikely to find the global maxima. If we permit selection pressure, then mutation is still just randomly generating solutions with no regard to the solutions already tried. Crossover, however, makes use of the fact that you've already done testing and work in previous generations.

Quote:
I do care, because if we're talking about swapping components of solutions, and those components deal with solutions located at different global maxima, then they will be incompatible. By 'will' I mean 'will likely'.

They're not incompatible at all, just less 'fit', in which case they'll be discarded through selection, just as they would be if the mutation were unfit. No one ever claimed that a GA (and in particular crossover) only ever produced candidate solutions at each generation that had higher fitness than the previous generation. The only claim made by a GA is that the expected fitness of the population at each iteration is monotonically increasing.


Cheers,

Timkin
Quote: Original post by Timkin
and in these spaces, the rate of convergence of the GA will be higher (so crossover could be said to be more effective).


If I understand correctly, the degree to which the encoding is orthogonal is proportional to the effectiveness of cross-over.

Quote:
...a common misconception regarding GAs. Any string made up of characters of cardinality >2 can be mapped into a string with characters of cardinality 2. That is, I can trivially represent the gene 'cat' and its alleles ('tabby','moggy' and 'tom', for example ;) ) using the binary sequences '10','01' and '11' (or any other sequence I choose). Any set of tokens that you wish to use for your domain can be mapped into binary strings and thus I can still discuss the merits of the search algorithm applied to your example, but do so in the space of binary encoding strings.


I don't see it... Treating my operands like binary strings at a binary level changes the entire behaviour.

cat dog rat .... octopus
1011 1000 0100 .... 1111

If I say cat/rat, I'm not saying all values between 0100 and 1011. I'm not saying 0/1 0/1 0/1 0/1... I am specifically saying nibble 1011 _or_ nibble 0100, not a combination thereof.

If my encoding is specific, as in cat / dog / rat / octopus, then I literally mean I want crossover to choose ONE OF, but not a mixture of or a range of values in between. The high-level representation serves to eliminate options. Just representing it as binary, and allowing the GA to work with the binary, does not let us eliminate options.
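A small sketch of the distinction I'm drawing, using the (made-up) 4-bit codes above:

import random

codes = {'cat': '1011', 'dog': '1000', 'rat': '0100', 'octopus': '1111'}

def gene_boundary_crossover(parent_a, parent_b):
    # choose whole genes: the result is always one of the original labels
    return [random.choice(pair) for pair in zip(parent_a, parent_b)]

def bit_level_crossover(label_a, label_b):
    # cut inside the nibble: the result may be a code that belongs to neither parent
    cut = random.randint(1, 3)
    return codes[label_a][:cut] + codes[label_b][cut:]

print(gene_boundary_crossover(['cat'], ['rat']))  # ['cat'] or ['rat'], nothing else
print(bit_level_crossover('cat', 'rat'))          # '1100', '1000' (a 'dog'), or '1010'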



Quote: Original post by Timkin
Quote:
Doesn't this satisfy "GAs preserve good quality schema"? Adding a third parent wouldn't change that.


Yes, it would. Consider the schema you get if the third parent you select for the tuple is 'goat turkey fox elephant'. What are the set of offspring now? And what values of the objective function would be found in that set? The variance of the objective function over the offspring set is a measure of the quality of the schema represented in the parents.

The larger the offspring set, the more likely a higher variance (and vice versa).


I still do not see how this changes the schema. All of the same operands exist, in the same order they appeared in the original three parents, but in a different combination.



Quote: Original post by Timkin
Quote: When, earlier, I said that the optimal number of parents might be related to dimensions in solution space I was considering the following:

a = f( x, y, z);
.
.
.



That's simply not true, since you can make no assumptions about whether the correct (solved) part of each parent chromosome will end up in the offspring.


This is no different than with two parents. We can make no assumptions about whether the correct (solved) part of each parent chromosome will end up in the offspring.

Quote: Original post by Timkin
Furthermore, you're assuming that you are performing multi-site crossover, which is not a canonical operator.

I don't know what you mean by this.

Quote: Original post by Timkin
Quote: I take it you mean 'no, unless.......'.

No, that's not what I'm saying. [...] it will allow you to find the string of highest fitness that IS contained within the set defined by the combinations of alleles in the initial population. Make sense?


That's what I understood the first time. I was trying to be funny, but I guess it didn't come across well via text. :)

We'll get the best solution possible given the current operands that exist. If we're missing an operand then...... At which point crossover will just be working to randomize the order of the operands.


Quote: Original post by Timkin
Quote:
The same can be said of using mutation AND cross-over. I'll take your answer as 'regretfully yes'.


In the absence of selection pressure ....


What, exactly, do you mean by this? We always have selection pressure-- the fit stay and multiply.


Quote: Original post by Timkin
they both are unlikely to find the global maxima. If we permit selection pressure, then mutation is still just randomly generating solutions with no regard to the solutions already tried.


With mutation, a child is still mostly its parent. Doesn't this mean that it is an attempt to refine an already tried solution?

If we permit elitism (which is a pretty common practice I think) then we establish a nice baseline for testing out these new changes.


Quote: Original post by Timkin
Quote:
I do care, because if we're talking about swapping components of solutions, and those components deal with solutions located at different global maxima, then they will be incompatible. By 'will' I mean 'will likely'.

They're not incompatible at all, just less 'fit', in which case they'll be discarded through selection, just as they would be if the mutation were unfit. No one ever claimed that a GA (and in particular crossover) only ever produced candidate solutions at each generation that had higher fitness than the previous generation. The only claim made by a GA is that the expected fitness of the population at each iteration is monotonically increasing.


No problem-- maybe I'm looking at it from a different perspective. You say less fit, I say not compatible, which means less fit to the evaluation function.

Crossing over a member at peak A with a member at peak B is very unlikely to produce a good candidate, as they are both set on very different goals. They have a high likelihood of being incompatible. Cross a BMP with an EXE and you'll likely get neither a pretty picture nor a working application. This is all I'm saying.

If the members were clustered around the same peak then I could see crossover working well.

Will
------------------http://www.nentari.com
Quote: Original post by RPGeezus
If I understand correctly, the degree to which the encoding is orthogonal is proportional to the effectiveness of cross-over.

As a general rule, I would say yes... but I'm not certain that it's an absolute.

Quote:
I don't see it... Treating my operands like binary strings at a binary level changes the entire behaviour.


Only if you fail to encode correctly, i.e. fail to preserve a 1-1 mapping from your 'label' encoding to the binary encoding. 'cat' and 'dog' are just labels, as are '00' and '010'. So long as there are no binary alleles without corresponding 'string' alleles, then which I use is irrelevant.

Quote: If my encoding is specific, as in cat / dog / rat / octopus, then I literally mean I want crossover to choose ONE OF, but not a mixture of or a range of values in between.


...and obviously crossover can be chosen to apply only at gene boundaries. If you crossover between boundaries (and meet the stipulation of a 1-1 mapping), then crossover is performing selective mutation (i.e. it's mapping the 'split' allele to another allele, depending on the other part of the allele string it receives from the parent) as well as crossing the tails of the two chromosomes. This is where your confusion lies. It's difficult to imagine this working if my encoding is 'tabby', 'moggy' and 'tom', but not difficult if my encoding is '00', '01', etc.




Quote:
I still do not see how this changes the schema. All of the same operands exist, in the same order they appeared in the original three parents, but in a different combination.


Remember that the schema represents the possible set of allele values for that set of chromosomes. So, if each of the parents has the same allele for the gene 'cat', then the schema will have that allele. If any one parent has a different allele than the others, then the schema will have a wildcard. The larger the set of chromosomes considered, the more likely it is that the schema representing that set will have more wildcards. Subsequently, my comments about the type of offspring that set will produce stand. Indeed, this is the whole point of GAs. As the population nears convergence, the population schema is refined (the number of wildcards is reduced) until, at convergence, the only possible offspring (ignoring mutation during reproduction) is a copy of the parents.

Obviously, all pairwise schema still exist if you have 3 parents... but if you have 3 parents, then pairwise schema don't count, because presumably you're using all 3 parents in reproduction at the same time (the GA-orgy!). Otherwise, you're just doing pairwise reproduction. See my point?
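A compact sketch of that, assuming '#' as the wildcard symbol and using made-up bit strings (not the candidates from earlier in the thread):

def schema(parents):
    # keep an allele where every parent agrees; otherwise write the wildcard '#'
    return ''.join(col[0] if len(set(col)) == 1 else '#' for col in zip(*parents))

print(schema(['100101001', '111101011']))               # pairwise schema
print(schema(['100101001', '111101011', '001100100']))  # a third parent adds wildcards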


Quote:
This is no different than with two parents. We can make no assumptions about whether the correct (solved) part of each parent chromosome will end up in the offspring.

Absolutely. But the probability of it happening is higher. (1/2 vs 1/3).

Quote: Crossing over a member at peak A with a member at peak B is very unlikely to produce a good candidate, as they are both set on very different goals.


I don't think that's absolutely true. In certain problems that have multiple global maxima (for example, those where any 'solution' is a global maxima, such as N-queens), there may indeed be properties common to every solution. Thus, crossing over any two solutions may preserve those properties. I can't back this up with hard evidence, but Terry Kwok's work seems to suggest that, at least for problems like N-queens, there is a sort of subspace made up of all solutions that his algorithm can work on. Certainly, the rate of finding solutions is proportional to the number already found. Admittedly his isn't a GA approach, but it may be the case that appropriate encodings exist for GAs for such problems where this 'subspace' property can be exploited. Maybe not though.


Cheers,

Timkin
Quote: Original post by Timkin
So, now take our original 3 candidate solutions as a triplet. The schema resulting from this tuple is '###1####0', which, as you can see, is the worst (most general) schema from the pairwise set. Candidate solutions generated by producing offspring from a 3 way crossover would be randomly selected from this set and thus, would randomly select one of the possible fitness values spanned by this set. The probability of producing good quality offspring would be less than if pairwise crossover were performed.


Ok, let me just point out one tiny little detail :)

You're assuming that a # means there's a 50% chance of a 1 occurring (P(1)=50%) and a 50% chance of a 0 (P(0)=50%). This is how you arrived at your conclusion: information is lost. The thing is, if you're choosing from 3 parents, and for example one of them has a 0 and two of them have 1s, the probabilities are actually P(0)=33% and P(1)=67%. So it still preserves the information you said was lost. It's still arguable whether 3 parents are better or worse than 2.
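A quick sketch of what I mean, with made-up strings; at each position the parent set still carries the allele frequencies:

from collections import Counter

parents = ['101', '111', '011']
for pos, column in enumerate(zip(*parents)):
    counts = Counter(column)
    print(pos, {allele: round(c / len(parents), 2) for allele, c in counts.items()})
# a position with two 1s and one 0 gives P(1)=0.67, P(0)=0.33 rather than a flat 50/50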
(I hate parallel discussions in the same thread, but what the heck)

Quote: Original post by giveblood
Sorry if I'm missing something (I did skip over a lot of what I thought was irrelevant to what I'm about to ask), but if you MAKE the puzzle by starting with a completed one and taking away some values, what prevents you from creating a puzzle with more than one solution? Surely there has to be a certain way you have to make them blank, right? I haven't tried this yet so I don't know if it's true; it just seems like a possibility to me.


I don't think you have to do that. If there's another way to solve the puzzle and you found it, kudos to you :P It's a good thing in my opinion, and I don't think anyone will argue that a Sudoku puzzle generator *has* to ensure there's only one solution. My proposed algorithm is simple and neat (and it's not mine :P ) and I'd be surprised if it's not the most common out there.
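That said, for anyone who does want to guarantee a unique puzzle, the usual trick is to count solutions with a backtracking solver and only keep a removal if the count stays at one. A rough sketch (the grid layout, 0-for-blank convention, and helper names are my own, not part of the proposed generator):

def count_solutions(grid, limit=2):
    # grid is a 9x9 list of lists, 0 meaning blank; stop counting once `limit` is reached
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                total = 0
                for v in range(1, 10):
                    if valid(grid, r, c, v):
                        grid[r][c] = v
                        total += count_solutions(grid, limit - total)
                        grid[r][c] = 0
                        if total >= limit:
                            return total
                return total
    return 1   # no blanks left: the grid itself is one complete solution

def valid(grid, r, c, v):
    if any(grid[r][i] == v or grid[i][c] == v for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[br + i][bc + j] != v for i in range(3) for j in range(3))

# keep a clue removal only if count_solutions(puzzle) == 1 afterwards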

All of this is of course in my humble opinion, but I figure I will give my two cents anyways.

I have quite a lot of experience with GAs going back into the late '80s. I guess the Darwinian romance has worn off.

One of my biggest pet peeves is the reliance on 'real world' visualizations of what GAs are supposedly implementing, and then blindly implementing them. A Genetic Algorithm is nothing more than a generalized directed stochastic search.

What defines a GA is how you move from one population to another.

Population(t+1) = GA(Population(t))

This GA() function must satisfy a few well-known properties, but none of them has the notion of a 'generation' in it. The DNA is not a description of an individual. It is a piece of information that can be used in a directed search.

Keep thinking 'directed search.'

Why exactly should we divide the creation and destruction of this information into large scale changes? After producing a new piece of information, don't you want to use it immediately?

Most people seem to be doing:

10 Produce a lot of new information from the set of retained information
20 Merge it all into the set of retained information
30 Goto 10

While I advocate:

10 Produce a new piece of information from the set of retained information.
20 Merge it into the set of retained information
30 Goto 10

Or in GA speak:

10 Produce a new member of the population using crossover and/or mutation.
20 Merge Sort the member into the population, throwing out the worst member.
30 Goto 10

The generation method, with regard to algorithm implementation, is a needless complication at best (can anyone justify it?) and detrimental to efficiency at worst.
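A minimal sketch of the steady-state loop I'm advocating; the fitness function and the crossover/mutate operators are placeholders, only the loop structure follows the steps above:

import random

def steady_state_ga(population, fitness, crossover, mutate, steps):
    population.sort(key=fitness, reverse=True)       # keep best first
    for _ in range(steps):
        p1, p2 = random.sample(population, 2)        # draw parents from retained information
        child = mutate(crossover(p1, p2))            # produce one new piece of information
        population.pop()                             # throw out the current worst member
        population.append(child)                     # merge the new member in...
        population.sort(key=fitness, reverse=True)   # ...a binary insertion would avoid the full re-sort
    return population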


As far as crossover goes.

Crossover can be completely useless or extremely valuable. There seems to be no middle ground here. This depends on the specific problem set. In general, when crossover isn't a valuable commodity, you should not be using a genetic algorithm.

What does crossover do?

Technically: Crossover takes two strings of information and attempts to form a union between them, in the off chance that a more correct string of information can be derived from such unions.

In those cases where crossover isn't doing its job (and there happens to be work to be done), it is because the breeding population isn't diverse enough. Population diversity is critical in all areas of GAs. Keep thinking 'directed search.'

Without diversity and crossover, all you have is a stochastic hill climber.

I have been measuring diversity using the mean squared difference between the DNA in the breeding population.

When the diversity approaches zero, I know that it's time to stop running the genetic algorithm.

I don't know if the best solution has been found or not, but I know it's now in a stochastic hill climb and I know *for certain* that there are algorithms that perform a lot better than a stochastic hill climb that I can seed with the information attained thus far.
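For what it's worth, one way to compute that diversity measure, assuming each member's DNA is a sequence of numbers (that representation is my assumption, not necessarily how it was done originally):

def diversity(population):
    # mean squared difference between the DNA of every pair in the breeding population
    n = len(population)
    pairs, total = 0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum((a - b) ** 2 for a, b in zip(population[i], population[j]))
            pairs += 1
    return total / pairs if pairs else 0.0

# stop (or hand off to a local search seeded with the best members) when diversity(population) nears zero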

- Rockoon (Joseph Koss)
Quote: Original post by Jotaf
You're assuming that a # means there's a 50% chance of a 1 occurring (P(1)=50%) and a 50% chance of a 0 (P(0)=50%).


No, I'm not assuming that. I'm simply saying that '#' is a wildcard, meaning that any gene depicted with that allele in the schema can take on any of its allele values in the chromosome. As to the probability of each allele value, that obviously depends on the frequency of specific allele values in the (sub)population that the schema represents (so I'm not assuming 2 or even 3 members... I'm not assuming any number at all).

Quote: This is how you arrived at your conclusion: information is lost.

No, it isn't, since you were wrong about my first statements. A schema doesn't represent 'information lost'.

Timkin

