Question about genetic algorithms

JimPrice · 2004-09-22T11:44:38

Hi, Continuing my reading on AI, I've come across genetic programming. As I understand it, it works as follows: a} Loop over n generations. b) Each generation has m members. Each member of a generation has a single realisation of each of a number of characteristics. c) Calculate a fitness function for each member; keep the top few members and evolve the next generation from pairs drawn from this set. Also allow for the possibility of mutation, to keep from finding local features of the fitness function response surface. The final 'best' members of the final generation provide our genetically superior, trained, set. I would assume that if you wanted an 'easier' set, you could just draw realisations from an earlier generation. This seems very similar to problems that are encountered in statistical analysis very frequenty (ie multi-dimensional characteristic set, evaluable response). In statistical analysis, I would solve this problem using a response-surface analysis design (linear model or generalised additive model, built using splines). Essentially you generate members with given charateristics and use them to build up a mathematical profile of the fitness function, allowing characterisation both of where the 'optimal' region is, but also how changing individual (or sets of) characteristics changes the fitness function. At this stage (and admittedly, given my still limited knowledge of the AI field), I can't see the advantages that a genetic algorithm gives - what am I missing? Is anyone aware of any literature comparing techniques? Would anyone be interested in results from this if I played with it for a bit? Regards, Jim.

Artificial Intelligence Programming

Started by JimPrice September 03, 2004 08:28 PM

23 comments, last by Nathaniel Hammen 20 years, 2 months ago

Woodsman

426

September 15, 2004 09:09 PM

Just to let you know, most of my advice came from Koza's Genetic Programming.

Quote: Original post by Nathaniel Hammen
Quote: Original post by Woodsman
Look up 'editing'. ...

I already thought of doing this. Looking up previous ways of doing this will probably be useful. I'm a little leery of just destroying a "gene" that might be useful in a future generation. However, I also don't want to have my programs take up too much room. I'll look it up and see if anyone has a good solution to this. Off of the top of my head, I can only think of randomly deciding whether to do this or not; possibly, this random would be weighted to occur more often in larger trees.

Yeah, I meant it primarily in terms of the final result. It may be interesting to see the results of doing this at breeding time.

If a plant cannot live according to its nature, it dies; so a man.

Anonymous

September 17, 2004 04:42 PM

Quote: Original post by Nathaniel Hammen
I have programmed my own Genetic Program. To test it, I had it solve the distance between two points. On my first trial it got it by the tenth generation. On my second trial, I had it run for 224 generation before I decided to quit it. It was still off by more than 1,000. It also got slower and slower as the program trees evolved more and more complexity to try to solve such a simple problem.

On my third trial it got it on the 31st generation. I looked at the code that the third trial generated and it was essentially If(false&&a whole lot of shit , more shit , more if statements that eventually led to the algorithm). So these if statements were sort of like dominant and recessive genes. I guess thats a good thing, because some negative genes can carry over until they're needed, but it makes it so that there is a lot of code just to get sqrt((L-O)dot(L-O)).

I'm also concerned about the results I got on the second trial. It just couldn't come up with the answer. Is there any way I can prevent this, or do I need to check the program every once in a while, just in case. I was thinking that maybe I should introduce a brand new member into the population every once in a while, but this would probably make my programs worse.

Run in two stages.

First stage: find a valid solution using full featured GP

Second stage: find shortest equivilent by using only "crop" operators during crossover/mutation.

Stage 2 only garantees that irrelevant parts of the program tree are removed

Vlion

151

September 20, 2004 11:52 PM

Look up academic papers by Terry Soule if you are concerned with the if(false && false) type bloat. He studies that stuff- phd dissertation, etc. Look for it on Citeseer.

The main problem I have with genetic tecniques is that it is (a) solution, not the solution, nor the solution within $epsilon of accuracy. This becomes important when you are working with functions you have no idea of how they are minima-, optima- continuousness-wise.

But yeah.

Look for GECCO papers- its a big GA/GP conference.

~V'lionBugle4d

kwatz

122

September 21, 2004 01:01 PM

Quote: Original post by Nathaniel Hammen
The mathematical formula was only for a test to see whether my GP worked, and could evolve the answer. After I work out all of the kinks, I'm going to run my GP on a little battle arena type thing...

Gotcha.

Quote: Original post by Nathaniel Hammen
Tree Selection
I have a struct called ReproductionRate that I plug numbers into at the beginning of running the program. This decides how many trees will be copied, how many will be crossed over, and how many will be mutated. The ToCopy amount of trees that had the best fitness score are copied to the next generation. To select the tree that will be mutated, I pick a ToMutat number of trees randomly from all of the trees except for the ToCopy amount of trees that had the lowest fitness score. For crossing over, I use each of the best ToCross amount of trees exactly once, and select which ones are to be crossed over with which ones randomly.

Can you give these numbers in terms relative to the population size? For instance, these values are often expressed as percentages; that is, on average, X% of the total population is mutated each generation, or Y% are candidates for crossover. Expressing them this way may help tell whether they may be too high or too low.

Quote: Original post by Nathaniel Hammen
Node Selection
For both Cross Over and Mutation, I select which node to use the operation on completely randomly. I plan to change this in the future, because there are obviously more nodes near the bottom of the tree, which makes the likelyhood of a specific node being selected increases as the depth increases. This makes it so that alot of times, the node selected is on the bottom row, or just above that, which means that only three or so nodes are being crossed over or mutated.

Regarding mutation, keep in mind that it generally best serves its purpose when it has small, incremental effects. (Note "generally" - this isn't always the case) So it may not be bad thing that it tends to choose nodes with relatively few children. Changing these nodes will have fairly small effects, while changing nodes higher up in the tree will have broader effects.
On the other hand, you may or may not want the same result with crossover. While I'm not convinced that your current method is better or worse, one alternative could be to choose a level in the tree each with equal probability, then choose a node from within that level. This has the opposite problem, in a way - the root node has a much greater probability of being chosen than any one of the leaf nodes. Just something to think about.

Quote: Original post by Nathaniel Hammen
Mutation
For mutation, I delete the selected node and just build random nodes as I do when creating the tree in the first place.

Something to consider that draws on my note above would be to change only the node itself instead of removing the entire subtree and replacing it. This would require the new node to take the same number of arguments as the old node, but it's arguably a smaller change.

Quote: Original post by Nathaniel Hammen
A book... Paying money... Probably not. But, maybe. After all, I did buy "AI Game Programming Wisdom".

If you're attending college, you should look into journal article availability. Other alternatives are memberships to the ACM or IEEE - both offer digital access to certain publications. It'll be harder to find tutorial-like information, but it'll also have a broader wealth of topics.

Are you able to characterize the problem you're having with your results at all? For example, is the population converging to a bad solution, or is it running for many, many generations and never producing a good solution, or perhaps some other problem?

Nathaniel Hammen

136

September 22, 2004 11:44 AM

Quote: Original post by Vlion
Look up academic papers by Terry Soule if you are concerned with the if(false && false) type bloat. He studies that stuff- phd dissertation, etc. Look for it on Citeseer.

The main problem I have with genetic tecniques is that it is (a) solution, not the solution, nor the solution within $epsilon of accuracy. This becomes important when you are working with functions you have no idea of how they are minima-, optima- continuousness-wise.

But yeah.

Look for GECCO papers- its a big GA/GP conference.

I'll look his stuff up.

Quote: Original post by kwatz
Quote: Original post by Nathaniel Hammen
Tree Selection
I have a struct called ReproductionRate that I plug numbers into at the beginning of running the program. This decides how many trees will be copied, how many will be crossed over, and how many will be mutated. The ToCopy amount of trees that had the best fitness score are copied to the next generation. To select the tree that will be mutated, I pick a ToMutat number of trees randomly from all of the trees except for the ToCopy amount of trees that had the lowest fitness score. For crossing over, I use each of the best ToCross amount of trees exactly once, and select which ones are to be crossed over with which ones randomly.

Can you give these numbers in terms relative to the population size? For instance, these values are often expressed as percentages; that is, on average, X% of the total population is mutated each generation, or Y% are candidates for crossover. Expressing them this way may help tell whether they may be too high or too low.

Well, as I said, I can input different numbers for these values at run-time. Usually I use 10 for Copy, 80 for Cross, and 10 for Mutat.

Quote: Original post by kwatz
Quote: Original post by Nathaniel Hammen
Node Selection
For both Cross Over and Mutation, I select which node to use the operation on completely randomly. I plan to change this in the future, because there are obviously more nodes near the bottom of the tree, which makes the likelyhood of a specific node being selected increases as the depth increases. This makes it so that alot of times, the node selected is on the bottom row, or just above that, which means that only three or so nodes are being crossed over or mutated.

Regarding mutation, keep in mind that it generally best serves its purpose when it has small, incremental effects. (Note "generally" - this isn't always the case) So it may not be bad thing that it tends to choose nodes with relatively few children. Changing these nodes will have fairly small effects, while changing nodes higher up in the tree will have broader effects.
On the other hand, you may or may not want the same result with crossover. While I'm not convinced that your current method is better or worse, one alternative could be to choose a level in the tree each with equal probability, then choose a node from within that level. This has the opposite problem, in a way - the root node has a much greater probability of being chosen than any one of the leaf nodes. Just something to think about.

Well then, I'll use the same method I am now for Mutation, but for Cross-Over, I'll use some sort of curve, so that it will be more likely to select nodes at a middle depth.

Quote: Original post by kwatz
Quote: Original post by Nathaniel Hammen
Mutation
For mutation, I delete the selected node and just build random nodes as I do when creating the tree in the first place.

Something to consider that draws on my note above would be to change only the node itself instead of removing the entire subtree and replacing it. This would require the new node to take the same number of arguments as the old node, but it's arguably a smaller change.

For some functions, such as an if statement, that function is the only one with the same number and types of inputs... No change would occur.

Quote: Original post by kwatz
Are you able to characterize the problem you're having with your results at all? For example, is the population converging to a bad solution, or is it running for many, many generations and never producing a good solution, or perhaps some other problem?

Hmmm... I never tried to characterize it. All I know is that it always either got the answer in under 50 generations, or would never come up with anything useful. However, the never coming up with anything was pretty rare; it's just that, when it did, it ran for a longer time, before I decided to make it give up.

I am the master of ideas.....If only I could write them down...

Question about genetic algorithms

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Question about genetic algorithms

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines