Advertisement

How much of a performance hit do recursive functions have?

Started by November 11, 2000 01:06 PM
79 comments, last by cearny 24 years, 2 months ago
To bishop_pass about your own stack maintainment. A stack is such an easy datastructure, that you can make all functions inline, eliminating the extra function call. But I don''t think you will gain to much, after all you have to add a few extra instructions.
Another sollution I just thought of is using inline asm for stack maintaiment, that should be pretty fast.
I propose this problem to be of equal difficulty but simple enough to implement for testing:

int ray_trace (RAY *ray, int depth){   	RAY rays[64];	int final_color = 0;	if (depth < MAX_DEPTH)	{		n = rand() % 64;		for (r = 0; r < n; r++) 		{			final_color += ray_trace(&ray[r], depth + 1);		}		return (final_color / n);	}	else return rand() % 256;}    


I realize it's not the same problem.
I'm just trying capture what would be hard about doing this iteratively, not clutter it with extra stuff. If you agree this captures the essence of the problem, I think I can generate the iterative algorithm (without a stack ADT) we could comment on. (Although I haven't a clue which will be faster, but it will sure be interesting.)

Edited by - WMiller on November 13, 2000 11:38:11 PM
Advertisement
quote: Original post by WMiller

Is this problem of the same difficulty as the one posted above:

int ray_trace (RAY *ray, int depth){   	RAY rays[64];	int final_color = 0;	if (depth < MAX_DEPTH)	{		n = rand() % 64;		for (r = 0; r < n; r++) 		{			final_color += ray_trace(&ray[r], depth + 1);		}		return (final_color / n);	}	else return rand() % 256;}   


I realize it''s not the same problem.
The point is that we''re trying to capture what would be hard about doing this iteratively, not clutter it up with a bunch of function calls. If you feel this captures the essence of the problem, I will produce an iterative solution that does not use a stack. (Although I haven''t a clue which will be faster, but it will sure be interesting.)


It appears to capture the essence of the problem. To be realistic, expect the RAY structure to be between 50 to 100 bytes.



_______________________________
"To understand the horse you'll find that you're going to be working on yourself. The horse will give you the answers and he will question you to see if you are sure or not."
- Ray Hunt, in Think Harmony With Horses
ALU - SHRDLU - WORDNET - CYC - SWALE - AM - CD - J.M. - K.S. | CAA - BCHA - AQHA - APHA - R.H. - T.D. | 395 - SPS - GORDIE - SCMA - R.M. - G.R. - V.C. - C.F.
Recursion can be implemented as an "iterative" loop, and with a stack. However, you''re still simulating recursion by doing that stack stuff. You''re doing the same thing!! What you''re complaining about is function calls vs. loops and stacks. In this case, the speed depends greatly on the compiler. Either way, you''re probably still implementing the same divide-and-conquer algorithm.


"Whoever performs only his duty is not doing his duty." --Bahya ibn Pakuda
"If a man does not keep pace with his companions, perhaps it is because he hears a different drummer. Let him step to the music he hears, however measured or far away"--Henry David Thoreau
It may take me a little while to come up with the C++ implementation, but I think the basic gist will go like this:

Use an n-ary tree post order traversal. This will look just like the iterative binary tree solution on my website, except that instead of UPLEFT, UPRIGHT, and DOWN there is UP0, UP1.. UP63, and DOWN.

As before, the data structure needs to be modified. A child must know which child he is, and who his parent is (otherwise I believe the problem impossible). Since the number of children is not fixed, the parent must know how many children it has as well.

I think our model will have to be changed (I know little about ray-tracing, so I''m not sure what''s agreeable)

n = rand() % 64 means an average of 32 children per node. At a depth of 5, this is enormous. Depending on what depth we are talking about, I think a single test might take hours. I wonder if anyone can suggest a better algorithm for determining n that''s reasonable. Possibly just use a number lower than 64?

Using random numbers might not work, even with a common random seed. If the order of "rolling the dice" is not precisely the same in both algorithms (particularly relevant to determining the number of children), two totally different results will get calculated -- and that''s not good.
quote: Original post by WMiller

n = rand() % 64 means an average of 32 children per node. At a depth of 5, this is enormous. Depending on what depth we are talking about, I think a single test might take hours. I wonder if anyone can suggest a better algorithm for determining n that's reasonable. Possibly just use a number lower than 64?


NOTE: I've edited this several times. It should be correct now.

A ray's contribution to the final value is diminished the deeper it is in the tree. That is why I suggested that the create_ray_distribution() function modulate the number of children based on depth.

A simplified version might be like this:
n = rand () % 64 / depth^3;
where n is a minimum of 1.

However, this is not really the way it should be done. There could be a case where the first ray hits a perfect mirror. So we only send out one child ray. That ray hits a perfect mirror. So we send out one grandchild ray. That ray hits a perfect mirror. So we send out one great-grandchild ray. And that ray hits a diffuse surface so we send out 64 great-great-grandchild rays.

So really, a better way to calculate the number of child rays would be:
n = (1.0 - object->reflectivity) * weight * 64;
reflectivity ranges from 0.0 to 1.0.
0.0 = perfect diffuser.
1.0 = perfect mirror.

weight = weight * object->reflectivity and is updated after n is calculated and before we pass weight on to our children. The root ray has a weight equal to 1.0 and passes object->reflectivity of the object it hit as weight to its children.

This way a ray bouncing from mirror to mirror sends its importance to its children, but a diffuse ray 'diffuses' its importance.

So randomize the object reflectivity from 0.0 to 1.0 and you have all the info you need to make the ray tree more tractable.



Edited by - bishop_pass on November 14, 2000 1:34:53 AM

Edited by - bishop_pass on November 14, 2000 1:36:58 AM

Edited by - bishop_pass on November 14, 2000 1:38:23 AM

Edited by - bishop_pass on November 14, 2000 1:49:49 AM

Edited by - bishop_pass on November 14, 2000 1:52:45 AM

Edited by - bishop_pass on November 14, 2000 2:10:32 AM
_______________________________
"To understand the horse you'll find that you're going to be working on yourself. The horse will give you the answers and he will question you to see if you are sure or not."
- Ray Hunt, in Think Harmony With Horses
ALU - SHRDLU - WORDNET - CYC - SWALE - AM - CD - J.M. - K.S. | CAA - BCHA - AQHA - APHA - R.H. - T.D. | 395 - SPS - GORDIE - SCMA - R.M. - G.R. - V.C. - C.F.
Advertisement
I don''t think you have to know how many children you have. You just do a:

for( i=0; i<64; i++ )  if( children ) // != NULL    do_your_thingy_here(TM)... 


OK, raytracing is cool, but how about ROAM (even if I think that if Raytracing is faster iteratively than recursively, so would be ROAM... )

Tia,
Adrian
[ Libraries - STLport | boost | SDL | wxWindows ]
[ Manuals - MSDN | STL Docs ]
[ Compilers - VS.NET | MingW | DJGPP ]
[ Editors/Tools - EditPlus 2 | Anjuta | Dev-C++ ]
I''d love to see someone try to solve the Towers of Hanoi problem without recursion... heh

===============================================
If there is a witness to my little life,
To my tiny throes and struggles,
He sees a fool;
And it is not fine for gods to menace fools.
This is my signature. There are many like it, but this one is mine. My signature is my best friend. It is my life. I must master it as I must master my life. My signature, without me, is useless. Without my signature, I am useless.
Why not post a working C++ implementation (no psuedo or buggy code) of the recursive version? Perhaps the iterative version is easily adapted from one of the previous problems.

Edited by - WMiller on November 14, 2000 2:42:08 PM
quote: Original post by Big B

Languages like Scheme, ML, Lisp , and Prolog (my profs fav. language) make things impossible to do iterative solutions (or to have two statements in a row).



You cannot compare programming paradigms basing upon "computations costs". PROLOG is (usually) an interpreted language with a fairly complex run-time environment. C (as an example) is (usually) a compiled language with (almost) no run-time environment.

PROLOG is based on first-order-logic. It has *NOT* any iterative construct, so *YOU* *HAVE* to implement algorithms (opps, I would have said "solve problems") via a recursive approach.

Recursion can be (very) costly.

However, there are tecniques (like the "accumulator approach") and constructs (like the "cut" statement) that can be used to optimize *A* *LOT* PROLOG programs.

Those techinques (constructs) can be used (are present) in procedural languages like C. When used, they can speed-up problems implemented with recursive programming very efficiently.

Recursion is not as bad as it seems, IMHO. It has not to be used ALWAYS, though.

For example, when coding recursive functions (in a procedural language), you should not have a lot of function-arguments and you should not leave the calling-tree to reach endless depths... :-)


Bye,

Karmalaa

---
"Lifting shadows off a dream once broken
She can turn a drop of water into an ocean"

Edited by - karmalaa on November 14, 2000 4:46:16 PM
---[home page] [[email=karmalaa@inwind.it]e-mail[/email]]

This topic is closed to new replies.

Advertisement