Genetic Algorithms: Relative Weights and Evolution
Hey all. I am trying to solve a problem using genetic algorithms, but am running into some issues. My genes are represented as an array of 21 real numbers, which are weights relative to one another (so a simple chromosome of [0.7, 0.5] means weighting the first object by 0.7/(0.7+0.5) and the second by 0.5/(0.7+0.5) in an absolute weighting world). I run this on 345 unique 'models', with 60 models being in the population at any one time. Every time a new population is spawned, 60 models are chosen at random.
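To make the relative scheme concrete, here is a minimal sketch (the function name is mine, not from the post) of how a relative chromosome can be converted into absolute weights:

```python
def normalize(chromosome):
    """Convert relative weights into absolute weights that sum to 1.

    E.g. [0.7, 0.5] -> [0.5833..., 0.4166...].
    """
    total = sum(chromosome)
    if total == 0:
        # Degenerate chromosome: fall back to uniform weights.
        return [1.0 / len(chromosome)] * len(chromosome)
    return [w / total for w in chromosome]
```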
Let me explain how I am currently 'evolving' my species.
First, I have my generational aggregate constant. This is set to a random number between 1 and 10 every time a new population is created. This value tells me how many different models to run each set of genes on before determining its value. So if the value is 4, each gene will be run on 4 different models (for a total of 320 random model runs). For each model, I store that gene's fitness score in an array.
Next, once this is done, I find the 'surviving' genes -- those whose mean fitness minus the standard deviation of their fitness scores is above 0. Anything above 0 'survived' the generation.
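As a rough sketch of that survival test (assuming fitness_scores maps each chromosome, e.g. a tuple, to its list of per-model scores):

```python
import statistics

def survivors(fitness_scores):
    """Keep chromosomes whose mean fitness minus the standard
    deviation of their fitness scores is above zero."""
    kept = []
    for chromosome, scores in fitness_scores.items():
        mean = statistics.mean(scores)
        # stdev needs at least two samples; treat a single run as zero spread.
        spread = statistics.stdev(scores) if len(scores) > 1 else 0.0
        if mean - spread > 0:
            kept.append(chromosome)
    return kept
```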
Taking those surviving genes, I check whether at least 15 have survived. If not, I don't even bother checking whether the simulation is complete; I loop back and create a new pool.
To create a new pool, I take all surviving genes and normalize their fitness scores (against each other). This normalized fitness score becomes that gene's selection probability. For each surviving chromosome, I find it a partner and mate their genes. The mating process chooses uniformly at random between single-point crossover, two-point crossover, disruption, arithmetic, heuristic, and none (return identical copies of the parents). Then I mutate using boundary (0 or 1) and random (between 0 and 1) schemes.
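A minimal sketch of that selection-and-mating machinery, showing only fitness-proportional selection, arithmetic crossover, and random mutation (the other operators follow the same pattern; the mutation rate is a made-up placeholder):

```python
import random

def select_parent(pool, probabilities):
    """Fitness-proportional (roulette-wheel) selection."""
    return random.choices(pool, weights=probabilities, k=1)[0]

def arithmetic_crossover(a, b):
    """Blend two parents gene-by-gene with a random mixing ratio."""
    t = random.random()
    child1 = [t * x + (1 - t) * y for x, y in zip(a, b)]
    child2 = [(1 - t) * x + t * y for x, y in zip(a, b)]
    return child1, child2

def mutate(chromosome, rate=0.05):
    """Random mutation: replace a gene with a fresh value in [0, 1]."""
    return [random.random() if random.random() < rate else g
            for g in chromosome]
```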
After doing this for all surviving genes, I fill in the rest of the population with random genes and start the whole process over, randomly selecting a new generational aggregate constant.
Now, this is failing to converge. From what I have noticed, there are very few survivors in early generations, then the surviving population blossoms, but at very low fitness levels. It seems that many of the 'stronger' genes simply dilute during reproduction, because a weight at a specific location has a very specific meaning -- especially when it is relative to the weights around it. A 0.7 in one set of genes may effectively be a 0.4 because all the rest of the weights are 0.9; transfer that 0.7 into a chromosome full of 0.3s, and it effectively becomes a 0.9. See my issue? One thought is to determine the 'equivalent' relative weights and set those values -- but how could I do this when an entire chunk is carried over at once? It would be much easier if these were absolute weighting values -- but unfortunately they are relative to one another. Should I try to develop an absolute weighting scheme -- or is there another solution?
Maybe after my relative chromosomes are developed I could just 'translate' them into absolute values before the system is run? The issue there is that every time two genes breed, the values are no longer absolute and are once again relative, which throws their meaning off (because they are not guaranteed to add up to 1). (Just thinking out loud here.)
I am debating tweaking some of my 'mating' methods to reflect this, but I would like to keep things as 'random' as possible to prevent premature convergence.
I am trying to think of ways of recognizing that a specific gene is common among survivors, and giving that gene a higher reproduction rate in that spot.
I am leaning towards speciation techniques to help with this. Any thoughts on that? Unfortunately, speciation is a bit tough because it requires some tweaking to determine how low your threshold value should be for deciding whether two genes belong to the same species pool.
Any ideas? Thanks!
I am reading a paper that represents genes as bits, and then interprets those genes as an array of 32-bit IEEE floating-point numbers. I could do this as well, but would it help with my 'relative' versus 'absolute' weighting issues?
The reason my weights must be 'relative' is that my final value must be clamped between 0 and 1 -- since all my values at any time are between 0 and 1, my weights all have to be relative to each other.
This is true because I am also co-evolving the 'trigger' values: when a total value is computed, it is checked against the trigger value to determine whether an event should occur. These events, based upon their timing, determine the fitness score.
It seems to me like my genes have far too much relative information in them.
Perhaps I could break my genes down into different chromosomes and somehow co-evolve? I dunno.
I am having trouble thinking this through. My head hurts.
Ah, I figured it out! I realized that if the weights were taken as absolute, the maximum possible value would be the total number of genes I have. Therefore, I now use the absolute values and compare against the 'trigger value' multiplied by the number of genes.
Well, I can't quite give away the exact nature of my project, but I will describe it in the best way I can (because I am still having trouble).
I have a chromosome, represented as 21 genes, each represented as a real between 0 and 1. The last two genes represent two specific triggers. The other genes are used as weights, in the following way:
Total Value = W(17) * Sum[i = 0 to 9] (V(i) * W(i)) + W(18) * Sum[i = 10 to 16] (V(i) * W(i))
If Total Value > Gene[19] * K, then trigger A
If Total Value < Gene[20] * K, then trigger B
Where V(i) represents some valuation measure at the current time, and W(i) represents the corresponding weight for that valuation (namely, the gene at that location in the array). K is some constant (to be discussed).
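In code, the evaluation might look like this minimal sketch (function and variable names are mine; K is left as a parameter since choosing it is the open question):

```python
def evaluate(V, W, K):
    """Compute the Total Value and check the two triggers.

    V: 17 valuation measures, each in [0, 1].
    W: the 21-gene chromosome; W[0..16] are weights,
       W[17] and W[18] scale the two partial sums,
       W[19] and W[20] are the trigger genes.
    """
    total = (W[17] * sum(V[i] * W[i] for i in range(0, 10)) +
             W[18] * sum(V[i] * W[i] for i in range(10, 17)))
    trigger_a = total > W[19] * K
    trigger_b = total < W[20] * K
    return total, trigger_a, trigger_b
```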
The issue I am facing is twofold -- first of all, to enable the triggers I have to come up with some constant K. The reason is that genes are between 0 and 1, so genes 19 and 20 represent at what fraction of the total possible score to trigger. Now, my issue is what to give K. Theoretically, I could give K the maximum possible value, when all weights are 1 and all valuations are 1 -- which would be 17. The issue here is that for a specific gene the actual maximum will often be much lower than that -- and the triggers never execute, because genes 19 and 20 don't evolve rapidly enough.
Another choice would be to use the maximum total value, by summing up all the genes between 0 and 16. The issue with this is that a gene's weight is then relative to the weights around it. It solves the issue of genes 19 and 20, but it means that 'evolution' no longer makes sense -- when a gene is taken out of context and passed into another chromosome, it loses all of its value, because it represented part of a whole.
You see my issue? Now, one paper I read represented real values simply as the ones and zeroes of the 32-bit IEEE specification for floats -- but it doesn't seem like that will solve my relative-weighting issue. I don't think they had to deal with the same 'threshold' value issues I do.
Now, theoretically I could just choose at what levels to fire these triggers and fix them as constants -- but I would prefer to evolve them if at all possible. The issue with defining a constant level is that your weights are then no longer relative and are required to be of a certain magnitude. If I arbitrarily made the trigger value for trigger A 13, the sum of all the weights would have to be at least 13 for it ever to fire -- which seems silly to me.
Any ideas?
Make the thresholds constants and let the genetic algorithm find the weights that work with those constants.
I did something similar. Determining a specific constant didn't seem to work too well (none of the models survived). Instead, I took the total value of the model after the weights were applied and put it through the logistic function E(x) = 1/(1+e^(-x)), which bounds it between 0 and 1 -- and then I just let all my trigger values and weights vary between -1 and 1. It should solve my problem (the thing takes a day or two to run due to the amount of data).
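For reference, a sketch of that squashing step (math.exp is the standard library call; the function name is mine):

```python
import math

def squash(x):
    """Logistic function: maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))
```

If a (-1, 1) output range were actually wanted to match trigger values in that interval, math.tanh(x) would give that instead.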
That description makes much more sense. Thank you.
You mention that specific weights will have a relationship to adjacent weights, and this may create an issue during crossover in which a weight taken out of context may not make sense. This is related to Holland's schema theory, in which certain "blocks" of genetic information are related and define some important aspect of the phenotype together rather than independently. I don't actually think this will be a problem for your evolutionary process, though, as long as you choose an appropriate crossover function. If you use a standard 2-point crossover, you will swap chunks of genetic information between organisms, which preserves the vast majority of neighboring details. Only two junctions will be out of context. Using something like a uniform crossover would probably be detrimental, as it preserves almost no neighborhood relationship.
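A minimal two-point crossover sketch (naming is mine), to show how only the two cut points break up neighborhoods:

```python
import random

def two_point_crossover(a, b):
    """Swap the segment between two random cut points.

    Genes inside the swapped segment keep their neighbors,
    so only the two junctions lose local context.
    """
    i, j = sorted(random.sample(range(1, len(a)), 2))
    child1 = a[:i] + b[i:j] + a[j:]
    child2 = b[:i] + a[i:j] + b[j:]
    return child1, child2
```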
The other issue you mention is the critical value for selecting a trigger. It seems to me that if you were to transform your function to give values between 0 and 1, you effectively have a probability function for each trigger. A threshold of 0.75 would fire Trigger A (values above it) 25% of the time and Trigger B the other 75% (assuming a uniform distribution of the function). This could be addressed in two ways. First, if you have any information suggesting an appropriate frequency of occurrence for each trigger, you could use that to set your trigger value. This imposes user knowledge on the system, but it may be an acceptable influence. As Vorpy mentioned, you could just select the value and let the weights adapt to that constant.
If, on the other hand, you don't want to influence the system or you don't have an idea of the relative frequencies for each trigger, you could simply add one more parameter to your genome -- the trigger threshold value itself. This would just be another floating-point or integer value at the end of the genome, subject to the same rules of crossover and mutation. You could represent it as a bit string (as you read in your IEEE paper), but I'm not convinced it would matter. If you define reasonable mutation and crossover operators, including this value as an integer/float in the genome should work fine. Then you can allow the actual threshold value to evolve within your domain.
Let me know if I can clarify my answer in any way.
-Kirk
To expand and improve on kirkd's suggestion, I suggest you just normalize the values to compare a number between 0 and 1 to thresholds between 0 and 1.
You need to know the minimum and maximum possible values of each V(i). From these constants and the first 19 W(i) values you can then easily compute the minimum and maximum, minX and maxX, of your function
X = W(17) * Sum[i = 0 to 9] (V(i) * W(i)) + W(18) * Sum[i = 10 to 16] (V(i) * W(i))
(By the way, is there a compelling reason for the redundant W(17) and W(18), which perturb the linear combination of input values you are actually using?)
Then (X-minX)/(maxX-minX) is between 0 and 1, and can be compared with W(19) and W(20) without involving any hand-fudged constant K.
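A sketch of that normalization, assuming each V(i) lies in a known range [vmin[i], vmax[i]] and all weights are non-negative, so X is monotone in each V(i) (names are mine):

```python
def normalized_total(V, W, vmin, vmax):
    """Rescale X into [0, 1] using its minimum and maximum possible
    values, so it can be compared directly with W[19] and W[20].

    Because all weights here lie in [0, 1], X is largest when every
    V(i) is at its maximum and smallest when every V(i) is at its
    minimum.
    """
    def total(values):
        return (W[17] * sum(values[i] * W[i] for i in range(0, 10)) +
                W[18] * sum(values[i] * W[i] for i in range(10, 17)))

    x = total(V)
    min_x = total(vmin)
    max_x = total(vmax)
    if max_x == min_x:
        return 0.0  # Degenerate chromosome: every weight is zero.
    return (x - min_x) / (max_x - min_x)
```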
A strikingly strange, unrelated aspect of your GA is that you randomly select the number of tests you run. Some generations get too few and others too many: why not choose an appropriate fixed value, e.g. with cross-validation experiments?
More specific to your problem is the curious limitation of comparing the same linear combination of inputs with two thresholds: this means choosing two parallel hyperplanes in a 17-dimensional vector space that partition possible inputs between the two half-spaces, those for which you do A and those for which you do B, with a slab in the middle for which you do neither action or both, depending on whether W(19) is larger or smaller than W(20). If the optimal shapes of these three regions are nonplanar, you won't be able to represent them.
Omae Wa Mou Shindeiru