Avoiding for loops
In several vector and matrix classes used for real-time rendering, I've often observed code like
matrix &matrix::operator += ( const matrix &u )
{
    m[0] += u.m[0];
    m[1] += u.m[1];
    m[2] += u.m[2];
    // ...and so on, up through m[15]
    return *this;
}
This seems to be preferred over a simple for loop:
matrix &matrix::operator += ( const matrix &u )
{
    for( int i = 0; i < 16; i++ )
        m[i] += u.m[i];
    return *this;
}
So I must put forth the question: why is there a tendency to avoid for loops? Is there some speed advantage? If so, is the speed advantage really that great and why?
Regards,
Tim
There is definitely a speed advantage in removing loops from code, but it's actually not as bad as you would think to have a loop like the one in your example... mainly because any modern compiler will unroll the loop for you, since there is a fixed number of iterations.
Manual loop unrolling was something that mattered way back when optimising compilers were not so optimising.
I personally would write it with the loop in (for the sake of clearer code), and let the compiler do its thang when I do a Release build... much of a muchness really.

I'm not really sure what the difference is when the code is actually compiled, but my first thought is that a for loop is definitely eating up more cycles. By just hard-coding the lines in with the references, you can see exactly what you're giving to the compiler. With a for loop, there may be initialization of a variable for the counter, plus it checks the counter against the limit until it breaks the loop, and it has to increment the counter. This may not seem like a big deal, but when you're calling it many times every game loop, it can be a lot of extra work. There is just no need to use a for loop in this example. For apps that don't need great speed, it is often easier just to use a for loop, since the time difference isn't noticeable when used a few times.
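Just to illustrate the bookkeeping being described here (this is only an illustrative sketch of what the loop means, not what any particular compiler actually emits), the for-loop version is equivalent to something like the following, with the counter initialization, the comparison and the increment spelled out. An optimizing compiler will usually unroll a fixed-count loop like this and remove that bookkeeping entirely.

struct matrix
{
    float m[16];

    // For-loop version written with the loop bookkeeping made explicit.
    matrix &operator += ( const matrix &u )
    {
        int i = 0;              // counter initialization
        while( i < 16 )         // comparison on every pass
        {
            m[i] += u.m[i];     // the actual work
            ++i;                // counter increment on every pass
        }
        return *this;
    }
};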
[JESUS SAVES|Planet Half-Life]
One of my professors was telling us about pipelines in CPUs. Basically, what you have are instructions that take, say, 20 steps to process. So you stick an instruction on the pipeline, and 20 steps later it's finished. Like an assembly line. And at any one time, 20 consecutive instructions are being processed in that line. But when the machine comes to a part of code that checks a condition (if, for, while, switch, etc.), the CPU won't know which instruction to put next on the pipeline. Like this:
...
if( a < b )
    a = 1;
else
    a = 2;
If you had that, the CPU has to guess whether it's going to have to start the "a = 1" or the "a = 2" instruction on the pipeline, and if it's wrong, it has to scrap everything on the pipeline and start over with the correct instruction. Anyway, it's a similar situation with any other checking statement, and a for statement is like having an 'if' statement for every time it goes through. It's a little thing, I mean the pipeline stuff happens extremely fast, but if you're doing 10,000 if statements a frame, it can start to affect performance.
Just thought it was something interesting I learned about. It's one place the "tendency to avoid for loops" comes from. For your case, though, I agree with the first response.
NeHe's code is meant as 'tutorials', right? The reason for avoiding "for"s, as I see it, is to make it clearer what each operator does.
"Let me just ejaculate some ideas"
quote:
Original post by LuckyNewbie
NeHe's code is meant as 'tutorials', right? The reason for avoiding "for"s, as I see it, is to make it clearer what each operator does.
Granted, but I've seen it in many classes not intended for tutorial purposes.
Regards,
Tim
quote:
Original post by granite811
So you stick an instruction on the pipeline, and 20 steps later, it's finished. Like an assembly line. And at any one time, 20 consecutive instructions are being processed in that line.
Always having a full pipe is not very common. For example, if you have two consecutive instructions A and B, where B depends on the result of A, you cannot execute B until A is done. There is out-of-order execution, where instructions are executed not in the order they are given in the code, but still in such a way that the result is unaffected. But having 20 instructions in flight at the same time happens more or less only in the ideal case.
quote:
Anyway, so it's a similar situation with any other checking statement, and a for statement is like having an 'if' statement for every time it goes through.
Any decent processor with a pipeline depth of 20 will also have a decent branch predictor. I'm not sure what kind of predictor today's processors use, but I would guess it's based on something like "do whatever we did last time". That is, when the processor hits a branch, it will take the same direction as it did the last time the same branch was hit (assuming it's still in the cache) (edit: it will, of course, only start executing the instructions in that direction and discard the result if the prediction was wrong). So for a for loop, where the branch goes the same way as long as the exit condition is not met, the cost of branching should be virtually zero, because the processor guesses the right direction.
I'm no expert in this area, but the above is at least what I think is correct.
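To make that concrete, here is a small standalone test program (my own sketch, not from this thread) that compares a branch the predictor always gets right with one driven by random data. The exact numbers depend heavily on the compiler and CPU, and an optimizer may turn the data-dependent branch into a branchless conditional move, in which case the gap disappears, so treat it purely as an illustration.

#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <vector>

int main()
{
    const int N = 1 << 24;
    std::vector<int> data( N );
    for( int i = 0; i < N; ++i )
        data[i] = std::rand() & 1;   // unpredictable 0/1 pattern

    using clock = std::chrono::steady_clock;

    // Case 1: the branch goes the same way on every iteration,
    // so a "same as last time" predictor is almost always right.
    long long predictable = 0;
    auto t0 = clock::now();
    for( int i = 0; i < N; ++i )
        if( i >= 0 )                 // always true
            predictable += data[i];
    auto t1 = clock::now();

    // Case 2: the branch depends on random data,
    // so the predictor is wrong about half the time.
    long long unpredictable = 0;
    auto t2 = clock::now();
    for( int i = 0; i < N; ++i )
        if( data[i] )                // taken roughly 50% of the time
            unpredictable += 1;
    auto t3 = clock::now();

    std::printf( "predictable branch:   %lld ms (sum %lld)\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>( t1 - t0 ).count(),
        predictable );
    std::printf( "unpredictable branch: %lld ms (sum %lld)\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>( t3 - t2 ).count(),
        unpredictable );
    return 0;
}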
[edited by - Brother Bob on June 9, 2003 7:11:57 AM]
Personally, what I do is write a metaprogram to unroll the loop. Some compilers refuse to unroll loops (due to mixed results) unless you specify a compiler flag, and since you don't really want all loops unrolled, just certain ones, I found that to be the best solution for me.
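For anyone wondering what that looks like, here is a minimal sketch of the template-metaprogram approach (my own version, not the poster's actual code). The recursion is expanded at compile time, so what the compiler ends up generating is the fully unrolled sequence of additions, with no loop counter or branch left at run time.

// Compile-time loop unrolling via template recursion.
// Unroll<N>::add() expands into N inlined element-wise additions.
template< int I >
struct Unroll
{
    static inline void add( float *dst, const float *src )
    {
        dst[I - 1] += src[I - 1];         // handle element I-1...
        Unroll< I - 1 >::add( dst, src ); // ...then recurse on the rest
    }
};

template<>
struct Unroll< 0 >                        // recursion terminator
{
    static inline void add( float *, const float * ) {}
};

struct matrix
{
    float m[16];

    matrix &operator += ( const matrix &u )
    {
        Unroll< 16 >::add( m, u.m );      // expands to 16 unrolled additions
        return *this;
    }
};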
Stick to the for loops; processors these days are extremely fast, and if you are doing games, performance almost always depends more on the graphics card than on the CPU.
I have a GF2MX100 and a P4 2.2GHz, while a friend of mine with a Duron 733 gets better framerates than me in most games because he uses a GF2MX400.
Unless you are programming for handhelds, stick to for loops, as it is more readable and less code.
Today people are using virtual machines such as Java and .NET, which are a lot slower than native C++ code, because today's processors are fast enough to handle them. As it is, games don't really use a lot of processing from the CPU these days; it's more from the GPU. RPG, action, strategy and all other genres of games were made on the Genesis and SNES, which were 6MHz and 3MHz respectively. If that hardware could handle those games, today's modern processors should not have any problem.
The only difference between today's games and yesterday's games is the graphics, which is handled by the graphics card. So as far as CPU usage goes, don't worry; it's the stuff that is sent to the gfx card that matters more.
[edited by - GamerSg on June 9, 2003 8:55:15 AM]