How can I swap sections of data in two different 2d arrays of the same length, forming a third 2d array?

Started by July 13, 2021 08:50 PM
3 comments, last by Shaarigan 3 years, 4 months ago

I am extremely new to coding and I am currently learning C# on my own. I have searched around but I cannot find what I am looking for. I may not be searching for the correct terms.

I am wondering how I could swap sections of two different 2D arrays with each other to form a new (child) 2D array.

I plan on using this in an evolution simulation, however what I am trying to do is not the same as a Genetic Algorithm. I do not care about spawning populations, fitness functions, etc.

So the basic idea is that I want to take two existing 2d arrays that are each 20 rows (genes) x 2 columns (A/B options), swap random sections, and then spawn the child 2d array of the same size.

Ideally I would like this to be similar to Uniform Crossover in genetics, where the data isn't simply split in half. I would need the arrays to have corresponding data locations that can be used as "swap points" for the data. I also want the original "parent" arrays to remain in existence while the child array is created.

A chromosome is made of an array of 20 genes, with each gene containing an array of 2 alleles.

[Chromosome]--->[Gene]--->[Allele]

Each gene represents things like color, size, and behaviors. Each gene has an A/B option (allele). These A/B options will be used later to determine "behaviors."

(Alleles that are swapped are capitalized)
Parent 1: [a,A,b,b,B,a,A,b,a,a,a,b,a,b,a,a,a,b,a,a]

Parent 2: [a,B,b,b,A,a,B,b,a,a,a,b,a,b,a,a,a,b,a,a]

Child 1: [a,A,b,b,B,a,B,b,a,a,a,b,a,b,a,a,a,b,a,a]

Notice how the data swapped at random corresponding points (loci). Also notice how a good chunk of the data remains the same. I would like to eventually add a "relatedness" check that determines what percentage the two original arrays have in common. This would be how I will choose the "parents" + their proximity to each other in the simulation.
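The "relatedness" check described above could be sketched roughly as follows. This is only a minimal sketch: the `Relatedness` class, the `PercentShared` method, and the choice of one `char` per gene are assumptions made for illustration, not something fixed by the design above.

```csharp
using System;

public static class Relatedness
{
    // Returns the percentage (0-100) of loci at which both parents carry the same allele.
    // Assumption: each chromosome is a char[] with one allele ('a' or 'b') per gene.
    public static double PercentShared(char[] parent1, char[] parent2)
    {
        if (parent1.Length != parent2.Length)
            throw new ArgumentException("Parents must have the same number of genes.");
        if (parent1.Length == 0)
            return 0.0;

        int matches = 0;
        for (int i = 0; i < parent1.Length; i++)
        {
            // Count every locus where the two parents agree.
            if (parent1[i] == parent2[i])
                matches++;
        }
        return 100.0 * matches / parent1.Length;
    }
}
```

For the two parents listed above, which differ at 3 of 20 loci, `PercentShared` would return 85.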

I would then need to figure out a way to tie individual methods to each A/B for each gene in order to execute the behaviors. (ex. MovementSpeedGene - A = add Random Range (0,2) to Movement Speed)
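One possible way to tie a method to each A/B option is a lookup table of delegates, indexed first by gene and then by allele. The sketch below is hypothetical throughout: `Allele`, `Creature`, `GeneBehaviors`, and the single movement-speed gene are made-up names, and `System.Random` stands in for whatever random source the simulation ends up using.

```csharp
using System;

public enum Allele { A, B }

// Hypothetical creature with a single trait, for illustration only.
public class Creature
{
    public double MovementSpeed = 1.0;
}

public static class GeneBehaviors
{
    // One entry per gene; index 0 of each pair is the A behavior, index 1 the B behavior.
    // Only one gene (movement speed) is filled in here.
    public static readonly Action<Creature, Random>[][] Table =
    {
        new Action<Creature, Random>[]
        {
            (c, rng) => c.MovementSpeed += rng.NextDouble() * 2.0, // A: add a value in [0, 2)
            (c, rng) => { }                                        // B: leave the speed unchanged
        }
    };

    // Runs the behavior selected by each gene's allele.
    // Genes without an entry in Table are skipped.
    public static void Apply(Allele[] chromosome, Creature creature, Random rng)
    {
        for (int gene = 0; gene < chromosome.Length && gene < Table.Length; gene++)
        {
            Table[gene][(int)chromosome[gene]](creature, rng);
        }
    }
}
```

Adding a gene then means adding one more A/B pair to `Table`, so the chromosome itself only stores which option is active.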

Any and all help is much appreciated!

First, why're you using 2D arrays? Genomes are always made of fixed pairs of bases, so Adenine, Cytosine, Guanine and Thymine. Adenine always pairs with Thymine in a double helix, the same is true for Cytosine and Guanine. So you have just 4 different states:

  1. AT
  2. TA
  3. CG
  4. GC

No need for a 2D array when you could instead use an enum.

What you now need is a simple linear for loop and a random number generator. If both arrays have the same size, you need only one loop index to access both arrays at once:

for(int i = 0; i < 20; i++)
{
    DNA fatherBlock = father[i];
    DNA motherBlock = mother[i];
}

The Random class allows you to build a switch, which will randomly select either Father or Mother as source:

Random r = new Random();
for(int i = 0; i < 20; i++)
{
    DNA dnaBlock;
    // Next(0, 2) returns 0 or 1; the upper bound is exclusive, so Next(0, 1) would always return 0
    if(r.Next(0, 2) == 1)
    {
        dnaBlock = father[i];
    }
    else dnaBlock = mother[i];
}

And you have what you wanted. Assigning the value of the chosen block to the new array at the same position is as easy as writing a hello world program.
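Putting the pieces together, the whole thing might look roughly like the sketch below. The `DNA` enum members and the `Crossover.Uniform` name are assumptions made for illustration; the point is only that the child is a newly allocated array, so both parents stay untouched.

```csharp
using System;

// Assumed enum for the four states listed earlier.
public enum DNA { AT, TA, CG, GC }

public static class Crossover
{
    // Builds a new child array; neither parent is modified.
    public static DNA[] Uniform(DNA[] father, DNA[] mother, Random r)
    {
        if (father.Length != mother.Length)
            throw new ArgumentException("Both parents must have the same length.");

        DNA[] child = new DNA[father.Length];
        for (int i = 0; i < child.Length; i++)
        {
            // Next(0, 2) yields 0 or 1, so each position comes from
            // the father or the mother with equal probability.
            child[i] = r.Next(0, 2) == 1 ? father[i] : mother[i];
        }
        return child;
    }
}
```

A call such as `DNA[] child = Crossover.Uniform(father, mother, new Random());` would then produce one child. Reusing a single `Random` instance across calls is usually preferable to creating a new one each time.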

This may be OK for a start, but in the real world DNA is recombined by taking one half from the mother and the other half from the father. To simplify: the double helix is unrolled and one side is removed. The two helix portions from father and mother are then put together, and the bases connect to each other like lodestones, trying to form the right combination of base pairs as I wrote above. So Adenine looks for a Thymine counterpart and so on. Pairs that do not fit are not connected, and are removed or supplemented with the matching counterpart. For example, this may look like this:

[ATTCGC]
[TAGCGT]
-----------
[ATTCGC ]
[TA GCGT]
----------
[TAAGCGT]

Shaarigan said:

First, why're you using 2D arrays? Genomes are always made of fixed pairs of bases, so Adenine, Cytosine, Guanine and Thymine. Adenine always pairs with Thymine in a double helix, the same is true for Cytosine and Guanine. So you have just 4 different states:

  1. AT
  2. TA
  3. CG
  4. GC

Like I said, I am new to C# and coding generally so I am not sure if using arrays is appropriate or not. Also, I understand that genomes are made of pairs of fixed bases. I am not trying to create a biologically accurate simulation of natural selection, simply a fun experiment. So, this is the reason I have structured the “genome” system the way I did. I am more interested in simplifying the A/C/G/T to just A/B so it is easier to implement A/B behaviors. This system is intended to be the index from which certain methods are called.

I am not sure if this code would still apply if I go the A/B route as I mentioned above (if it would could you please help me understand?). I am afraid that I was misunderstood as wanting to create a realistic representation of a Genetic Algorithm, which I am not. I think you were shaping my idea in the direction of accuracy, which I really appreciate. I was just hoping to figure out the simplest way to implement the A/B idea whether that is using arrays, enums, or something else.

You need to determine what you want: genome recombination (accurate or not doesn't matter), or a way to recombine two 2-dimensional arrays? Are they fixed to 2 fields each, or are you considering N fields in the second dimension? This makes a huge difference, and you should learn to ask for what you want, not what you have, or show some code.

Another possible solution would be an array of structs if you always have 2 fields in the 2nd dimension. So instead of

string[,] parentA; // either this or the one on the line below
string[][] parentA;

you could write

struct MyGenome
{
    public string Left;
    public string Right;
}

MyGenome[] parentA;

For 2D arrays of NxM fields, you don't have any option except for using 2 for loops to select the fields:

for(int i = 0; i < N; i++)
    for(int j = 0; j < M; j++)
    {
        string fatherGenome = father[i, j]; //or father[i][j] depending on how you defined the array
        string motherGenome = mother[i, j];
    }
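Extending those two loops so they actually build a child could look something like the sketch below. It assumes the `string[,]` form and picks one parent per row, so both fields of a gene always travel together. That per-row choice, and the `ArrayCrossover.Uniform` name, are assumptions rather than the only way to do it.

```csharp
using System;

public static class ArrayCrossover
{
    // Returns a new N x M array; each row (gene) is copied whole from one
    // randomly chosen parent. The parent arrays are only read, never written.
    public static string[,] Uniform(string[,] father, string[,] mother, Random r)
    {
        int n = father.GetLength(0);
        int m = father.GetLength(1);
        if (mother.GetLength(0) != n || mother.GetLength(1) != m)
            throw new ArgumentException("Both parents must have the same dimensions.");

        string[,] child = new string[n, m];
        for (int i = 0; i < n; i++)
        {
            // Pick the source parent once per row; Next(0, 2) yields 0 or 1.
            string[,] source = r.Next(0, 2) == 1 ? father : mother;
            for (int j = 0; j < m; j++)
            {
                child[i, j] = source[i, j];
            }
        }
        return child;
    }
}
```

Moving the `r.Next(0, 2)` call into the inner loop would instead choose a parent separately for every field.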
