Anyone know the probability density function for this?

Started by August 08, 2013 12:55 AM
18 comments, last by alvaro 11 years, 6 months ago

Hey there 48K ;) (EDIT: And welcome to gamedev, it's a great site for getting your maths/programming questions answered in a friendly way. You may want to change your email preferences though so you don't get spammed with "X posted a thread about..." emails).

I'm sure MATLAB will allow you to do statistical simulations via programming, but I haven't got any experience of using it.

On WoS I recommended Python, which is a LOT easier than line-number-era BASIC and has some really nice features for doing maths/stats stuff (like built-in vectors, lists, dictionaries, support for bignums [arbitrarily large numbers], etc.).

Java is another option but that is going to be more daunting to start off with if you're not used to compile/link style programs and is frankly a beast.

Python homepage:

http://www.python.org/

EDIT: You should probably go into more detail about what you are trying to achieve as well, something to do with simulating splitting DNA or RNA strands I believe?

"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

Thanks PS. I've actually just spent a bit of time Googling for tutorials/examples for R, and I've now modified them and written my first "program": a pair of nested loops.

x <- 1:10
z <- NULL
for(j in seq(along=x)) {
for(i in seq(along=x)) {
z <- c(z, x[i] * x[j])
}
}
x
z
plot (z,z)
Quite pleased actually by how simple it was to tweak a previous program (containing a single loop).
So, I may actually be more on track than I first thought!
The syntax is very different than what I am used to with BASIC, but the concepts seem to be retained - I guess it is all programming after all.
I may take a look at Python too - thanks for the heads up! - I need to reply over on WoS too.
Re: the question: Yes - I'm interested in modelling distributions of breaks/cuts in a linear element (in my case, chromosomes) - varying parameters, like frequency, and how the presence of one break/cut might affect the position of additional cuts.
When posting code, use the <> icon in the toolbar above the posting box, or [ code ] [ /code ] (without the spaces) to retain the formatting, like so:

EDIT: However, that seems to eat any text after the closing [ /code ] tag, oh deary me gamedev ;)


x <- 1:10
z <- NULL
for(j in seq(along=x)) { 
    for(i in seq(along=x)) { 
        z <- c(z, x[i] * x[j])   
    } 
}
x
z
plot (z,z)
"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

Thanks for that. I'm stuck already though. I can get data into R using the read.table function. But I cannot seem to do the necessary arithmetic on the data in the resulting vector. R seems to treat the data held in the imported vector differently than in a vector I assign directly.

For example, in the program listed above, if I read a txt file to populate x and then run it, instead of multiplying each value of x by every other value, iteratively, and putting the result in z, it only multiplies each element of x by itself, resulting in a z vector the same length as x, rather than x * x in size.

I'm not sure I am being clear... basically, given contents of x of 1:3 (but populated by a text import), the resulting z vector is just: 1, 4, 9

When it should be: 1,2,3, 2,4,6, 3,6,9

Are you familiar with R?
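
For comparison, here is a minimal sketch of the intended result in Python (the language suggested earlier in the thread). The list `x` is a hypothetical stand-in for the imported data, and this only illustrates the expected output; it does not diagnose the R import problem.

```python
# Values standing in for the imported data (hypothetical; the post uses 1:3).
x = [1, 2, 3]

# Multiply every element by every element: outer loop over j, inner loop over i,
# mirroring the two nested R loops.
z = [xi * xj for xj in x for xi in x]

print(z)  # [1, 2, 3, 2, 4, 6, 3, 6, 9]
```

The result has `len(x) ** 2` entries, which is the "x * x in size" vector described above.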

Nope, I don't know my R's from my elbow I'm afraid. Maybe ask in the "General Programming" forum, or someone who is familiar with it may see this thread?

I see it is a statistical language though so there is a chance someone reading this thread may be familiar with it.

I googled R language and saw it is also known as GNU S, make your mind up ;)

"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

I don't know much about R, but it is used at work and I should probably learn it. If you can post the exact commands that give you the undesired 1, 4, 9 result, I'll take a look.


In light of this, I have a question to Bacterius (or any of you): What software did you use to write/run the program/model you presented in post #2?

I just whipped up a C program providing me with the PDF data in list form, and plotted it with Mathematica. It would probably have been faster to do it in MATLAB or R, or even entirely in Mathematica, if you know them in advance. I didn't (though I should probably learn).

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

I normally use Perl for this type of quick thing:


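use strict;
use warnings;

my @count;

# Draw 5,000,000 samples of the minimum of three uniform random numbers
# and bin them into 500 equal-width buckets over [0, 1).
for (1..5000000) {
  my $x = min(min(rand(), rand()), rand());
  $count[int($x*500)]++;
}

# Print "bin count" pairs; a bin that was never hit prints as 0.
for my $i (0..499) {
  print "$i ".($count[$i] // 0)."\n";
}

# Smaller of two numbers.
sub min {
  my ($x, $y) = @_;
  return $x > $y ? $y : $x;
}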
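
For reference, the density that this histogram approximates can be worked out directly, assuming each `rand()` call behaves as an independent uniform draw on [0, 1). The symbols X, U_i, F and f are introduced here only for illustration.

```latex
\begin{align*}
X &= \min(U_1, U_2, U_3), \qquad U_i \sim \mathrm{Uniform}(0,1) \text{ independent} \\
P(X > x) &= P(U_1 > x)\,P(U_2 > x)\,P(U_3 > x) = (1 - x)^3, \qquad 0 \le x \le 1 \\
F(x) &= P(X \le x) = 1 - (1 - x)^3 \\
f(x) &= F'(x) = 3\,(1 - x)^2
\end{align*}
```

Under that assumption, the expected count in bin i of the 500 bins is 5,000,000 × [(1 − i/500)^3 − (1 − (i+1)/500)^3], which is roughly 29,940 for bin 0 and falls towards zero for the last bins.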

I actually modelled the whole "cut a piece of string" aspect instead of going directly to a minimum of three uniform random variables because I didn't know they were equivalent, so the code was a bit longer. But yeah, it was pretty much the same otherwise - binning each sample and then printing out how many samples fell into each bin.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”
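
The longer "cut a piece of string" program is not shown in the thread. As a rough sketch of the idea only, the Python below assumes a unit-length string, three uniformly random cuts, and one of the four pieces picked at random; the helper names `piece_lengths` and `histogram` are made up for this example. It bins the samples the same way and compares them with the minimum-of-three shortcut.

```python
import random

def piece_lengths(n_cuts, rng):
    """Cut a unit-length string at n_cuts uniform positions; return the piece lengths."""
    cuts = sorted(rng.random() for _ in range(n_cuts))
    edges = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]

def histogram(samples, n_bins):
    """Fraction of samples falling in each equal-width bin over [0, 1]."""
    counts = [0] * n_bins
    for s in samples:
        counts[min(int(s * n_bins), n_bins - 1)] += 1
    return [c / len(samples) for c in counts]

rng = random.Random(2013)  # fixed seed so runs are repeatable
n_samples, n_bins = 200_000, 20

# Model 1: cut the string three times, then look at one piece chosen at random.
from_string = [rng.choice(piece_lengths(3, rng)) for _ in range(n_samples)]
# Model 2: the shortcut, the minimum of three uniform random numbers.
from_min = [min(rng.random(), rng.random(), rng.random()) for _ in range(n_samples)]

h_string = histogram(from_string, n_bins)
h_min = histogram(from_min, n_bins)
largest_gap = max(abs(a - b) for a, b in zip(h_string, h_min))
print(f"largest difference between matching bins: {largest_gap:.4f}")
```

If the two models really are equivalent, the two histograms should differ only by sampling noise, so the printed gap should be small compared with the bin heights.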


I actually modelled the whole "cut a piece of string" aspect instead of going directly to a minimum of three uniform random variables because I didn't know they were equivalent, so the code was a bit longer.

Here's a quick way to see the equivalence: Instead of making N cuts in the string, start with a circle and make N+1 cuts in it. Wherever you make the first cut in the circle doesn't make any difference, and after you have made it you can choose coordinates in which that cut is at 0 and the length of the circle is 1. The remaining N cuts then fall uniformly on what is now a string of length 1. So the two procedures will result in the same probability distribution for lengths, but thinking of the circle makes it much easier to see that the lengths of all the pieces come from the same distribution. Well, at least that's what convinced me.
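
The circle argument can also be checked numerically. This is a sketch under assumptions: four cuts on a circle of circumference 1 (so N = 3 cuts on the string), a hypothetical helper `circle_pieces`, and an arbitrary threshold of 0.2. If every piece follows the same distribution, each position should show a mean near 1/4 and a fraction below 0.2 near 1 − 0.8^3 = 0.488.

```python
import random

def circle_pieces(n_cuts, rng):
    """Make n_cuts uniform cuts on a circle of circumference 1.

    Coordinates are shifted so the first cut sits at 0, which turns the
    circle into a unit-length string carrying the other n_cuts - 1 cuts.
    """
    cuts = [rng.random() for _ in range(n_cuts)]
    origin = cuts[0]
    shifted = sorted((c - origin) % 1.0 for c in cuts[1:])
    edges = [0.0] + shifted + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]

rng = random.Random(48)  # fixed seed so runs are repeatable
n_samples = 100_000
samples = [circle_pieces(4, rng) for _ in range(n_samples)]

threshold = 0.2
summary = []
for k in range(4):
    lengths = [pieces[k] for pieces in samples]
    mean = sum(lengths) / n_samples
    short = sum(length < threshold for length in lengths) / n_samples
    summary.append((mean, short))
    print(f"piece {k}: mean {mean:.3f}, fraction below {threshold}: {short:.3f}")
```

The first piece (position 0) is exactly the minimum of the three remaining cut positions, so agreement across all four rows is what the equivalence predicts.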
