
local variable == SLOW?

Started by October 17, 2000 05:41 PM
33 comments, last by jho 24 years, 2 months ago
I was trying to find the bottleneck in my program. I thought it was the math I'm doing, but it isn't. I have my own vector3D and matrix3D classes, and their member functions get called quite frequently, 1000 times per second or so. What I've noticed is that with a low-level function like this:

    inline void matrix3D::mult(const vector3D &V, vector3D &result) { vector3D temp; .... }

it runs faster if I take out the "vector3D temp;" local variable. My program sped up noticeably. The vector3D class is 40 bytes. If I take out the math, it doesn't make much difference. I added more vector3D local variables to test it, and that can slow my program to a halt. If I replace the vector3D class with a struct, it's ALSO noticeably FASTER!

Can you confirm whether classes are NOT the way to go for writing fast code? Is there something wrong with my vector3D class that makes it slow as a local variable? What should I do now? Should I rewrite my vectors and matrices as structs with a set of C-style mult, add, etc. functions? Assembly is not an option for me.
Just make your constructor inline. If you don't, there will be a bunch of copies and it gets messy. To tell you the truth, my vector class is almost all inline. The only functions that aren't inline are functions like rotate and such, because they are too big.

You could also use a struct; it just depends on whether you want to hide the implementation details. If you have no intention of hiding the implementation, then please use a struct. But an inline constructor will give you the same speed as a struct.
A struct is the exact same thing as a class, except that the default access is public: instead of private:.

Whenever you create any object that has a constructor, you pay that object's construction costs. If your vector3d constructor doesn't do anything (i.e. inline vector3d::vector3d () { }), it shouldn't cost anything to create it on the stack as a local variable.

All variables that are created on the stack have their memory allocated at the same time. It doesn't matter how big or how many local objects there are--it takes the same steps to allocate memory for all of them at once. However, you do pay to call the constructor of each object if it has one.
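To make the constructor-cost point concrete, here is a minimal sketch. The types Vec3Zeroing and Vec3Plain and both functions are hypothetical stand-ins, not the original poster's vector3D; the counter exists only so the constructor calls can be observed.

```cpp
#include <cstddef>

// Hypothetical stand-in for a vector class whose default constructor does work.
struct Vec3Zeroing {
    static std::size_t constructed;   // counts default-constructor calls
    double x, y, z;
    Vec3Zeroing() : x(0.0), y(0.0), z(0.0) { ++constructed; }
};
std::size_t Vec3Zeroing::constructed = 0;

// Same data layout, but no user-written constructor:
// declaring a local of this type runs no code at all.
struct Vec3Plain {
    double x, y, z;
};

// Each call creates one temporary, so the constructor runs once per call
// even though every field is overwritten immediately afterwards.
inline double dot_with_temp(const Vec3Zeroing& a, const Vec3Zeroing& b) {
    Vec3Zeroing temp;                 // three stores plus the counter update
    temp.x = a.x * b.x;
    temp.y = a.y * b.y;
    temp.z = a.z * b.z;
    return temp.x + temp.y + temp.z;
}

// No temporary object, so there is nothing to construct.
inline double dot_plain(const Vec3Plain& a, const Vec3Plain& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

Whether an optimizer removes the redundant zeroing depends on the compiler and build settings, so treat this as an illustration of where the cost comes from rather than a measurement.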

BTW, most treatises I've read on high-speed vector manipulation suggest that you overload their operators to return placeholder classes, and then evaluate the vectors only at the assignment step. Overloading each operator individually creates too many temporaries and tends to slow the program. I think this is what you're seeing.
jho: If you've got a reference to the result in your multiplication function, why create the temp?
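The "placeholder class" idea above is usually called expression templates. A rough sketch follows, assuming a C++11 or later compiler; Vec3 and Sum are hypothetical names invented for this example, and only operator+ is covered.

```cpp
#include <cstddef>

// Placeholder returned by operator+: it only remembers its operands
// and computes a component on demand.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// Hypothetical 3-component vector, not the thread's vector3D.
struct Vec3 {
    double v[3];
    Vec3(double x, double y, double z) : v{x, y, z} {}
    double operator[](std::size_t i) const { return v[i]; }

    // Evaluation happens here, in a single pass, when the expression
    // is finally turned into a real vector.
    template <typename L, typename R>
    Vec3(const Sum<L, R>& e) : v{e[0], e[1], e[2]} {}

    template <typename L, typename R>
    Vec3& operator=(const Sum<L, R>& e) {
        // Read every component before writing, so the assignment stays
        // correct when *this also appears inside the expression.
        const double x = e[0], y = e[1], z = e[2];
        v[0] = x; v[1] = y; v[2] = z;
        return *this;
    }
};

// Vec3 + Vec3 builds a placeholder instead of a temporary Vec3.
inline Sum<Vec3, Vec3> operator+(const Vec3& a, const Vec3& b) {
    return {a, b};
}

// (expression) + Vec3 nests the placeholders.
template <typename L, typename R>
inline Sum<Sum<L, R>, Vec3> operator+(const Sum<L, R>& a, const Vec3& b) {
    return {a, b};
}
```

With these definitions, `Vec3 r = a + b + c;` creates no intermediate Vec3. The placeholders hold references, so they should be consumed within the same full expression and not stored for later use.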


- null_pointer
Sabre Multimedia
Yes, classes are not the way to go if you want fast code.

Let me, however, rephrase that slightly:
Classes/structs with methods are not the way to go if you want fast code. This goes mostly for the constructor (as Stoffel pointed out), but holds for other methods as well. The speed loss will be most noticeable for small classes (typically vector classes and such), and so you should expect a speed gain from making these classes data-only classes.

In addition, unfortunately, passing arguments in function calls is also going to slow you down, so in some cases it will be faster to create a few global variables and have your functions assume that the correct values have been placed there before the call. This leads to difficult-to-read, hard-to-maintain, hard-to-debug code, however, so it's a matter of taste whether the speed gain is worth it or not...

- Neophyte

- Death awaits you all with nasty, big, pointy teeth. -
Neophyte:
Explain the results of the following:
    #include <iostream>
    #include <ctime>
    #include <cstdlib>   // atoi
    using namespace std;

    // the "C" way--no member functions
    struct IntStruct
    {
        int x;
    };

    int multiply (const IntStruct* is, int y)
    {
        return is->x * y;
    }

    // the "C++" way--constructors & member functions
    class IntClass
    {
    public:
        IntClass (int x) : m_x (x) { }
        int multiply (int y) { return m_x * y; }
    private:
        int m_x;
    };

    int main (int argc, char **argv)
    {
        int total=0;    // dummy part of algorithm, forces release build not
                        // to optimize-out any of the calculations
        if (argc != 2)
        {
            cout << "Usage: test secs_to_test" << endl;
            return -1;
        }
        int testSecs = atoi (argv[1]);
        unsigned long reps = 0;

        // time the struct + free function version
        clock_t stopTime = clock () + (CLOCKS_PER_SEC * testSecs);
        while (clock () < stopTime)
        {
            IntStruct is;
            is.x = reps;
            total += multiply (&is, reps);
            reps++;
        }
        cout << "Reps/sec in for C way: " << (double) reps / (double) testSecs
            << endl;

        // time the class + member function version
        reps = 0;
        stopTime = clock () + (CLOCKS_PER_SEC * testSecs);
        while (clock () < stopTime)
        {
            IntClass ic (reps);
            total += ic.multiply (reps);
            reps++;
        }
        cout << "Reps/sec in for C++ way: " << (double) reps / (double) testSecs
            << endl;
        return total;
    }


In debug builds, C way gets me about 3.14e+6 reps/second, and C++ gets me about 2.78e+6 reps/second. C is 13% faster than C++.

In release builds, C gets me about 3.21e+6 reps/second, whereas C++ gets me 3.34e+6 reps/second. C++ is 4% faster than C.

This is done using MSVC++ 6.0 SP4 on an NT machine (750 MHz).

(BTW, I'm not totally sure of the answer myself--I tried looking at the assembly listing and there's too much standard library stuff in there to get meaningful data. I expected both data points to be exactly the same).
quote: Original post by Neophyte

Yes, classes are not the way to go if you want fast code.

Let me, however, rephrase that slightly:
Classes/structs with methods are not the way to go if you want fast code. This goes mostly for the constructor (as Stoffel pointed out), but holds for other methods as well. The speed loss will be most noticeable for small classes (typically vector classes and such), and so you should expect a speed gain from making these classes data-only classes.
-


Bullcorn. If you write proper C++, with proper inlining, hinting, and a good optimizing compiler, the speed difference will be negligible. It's not that C++ is slow, it's that people do slow things in C++: passing classes by value, pointer dereferencing, billions of virtual functions, no inlining...
Cool!
I've never had the time to compare the two in such a way (C and C++, that is), so it's nice to see. Just curious as to why assembly isn't an option. If you're using all single-precision floats, the PIII has the SIMD instructions, which will fly through those vector and matrix operations. Also, both the Intel and AMD sites have tutorials and entire math libraries designed just for things like that. You could use these libraries and then plug them into your classes. As long as the member functions are properly inlined, there would be no penalty for using classes (as Stoffel was so nice to show us all!). Of course, it's only worth the bother if you want REALLY fast code.
The reason the class is faster is probably because of a few things:

(1) You're using the initialiser list inside the class, so you're copy-constructing the member variable instead of constructing and then assigning (the way the struct works).

(2) A class function defined inside the class is implicitly inline, so the function call for multiply inside the class is free.

(3) In VC the 'this' pointer is stored in a register when a class function is called, so access to 'this->m_x' is faster than 'is->x'.
Hi, it's me again. You don't *really* need a temp for the multiplication implementation. I eventually got rid of mine; I just have to be careful not to call ::mult(A,A), which is nasty because you'll change the input before it's done calculating.


Also, I've noticed the same thing about debug and release mode. My analysis came from debug mode; in release mode the speed difference is not as noticeable.


BTW my constructor does this: {x=y=z=0.0}. I used to have a homogeneous w, but I got rid of it because I am not using it.
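The mult(A,A) hazard mentioned above can be sketched as follows. Vec3 and Mat3 are hypothetical stand-ins for the thread's vector3D and matrix3D, and mult_unsafe exists only to show the failure; copying the three input components into scalars first is one possible way to tolerate aliasing without a full temporary object.

```cpp
// Hypothetical stand-ins for the thread's vector3D and matrix3D.
struct Vec3 {
    double x, y, z;
};

struct Mat3 {
    double m[3][3];

    // Writes straight into result. If v and result are the same object,
    // result.x is overwritten before v.x is read for the later rows.
    void mult_unsafe(const Vec3& v, Vec3& result) const {
        result.x = m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z;
        result.y = m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z;
        result.z = m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z;
    }

    // Reads all inputs into scalars before writing anything,
    // so it gives the right answer even when &v == &result.
    void mult(const Vec3& v, Vec3& result) const {
        const double x = v.x, y = v.y, z = v.z;
        result.x = m[0][0] * x + m[0][1] * y + m[0][2] * z;
        result.y = m[1][0] * x + m[1][1] * y + m[1][2] * z;
        result.z = m[2][0] * x + m[2][1] * y + m[2][2] * z;
    }
};
```

Three scalar copies have no constructor to run, so this is likely cheaper than a temporary object with a zeroing constructor, though the actual difference depends on the compiler.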

This topic is closed to new replies.
