Vtable overhead!

General and Gameplay Programming

Started by pitchblack November 13, 2000 08:40 AM

11 comments, last by pitchblack 24 years, 1 month ago

pitchblack

Author

122

November 13, 2000 08:40 AM

Hi, I was just wondering how much overhead (regarding speed) the vtable introduces when using virtual methods in C++?!

/pitchblack

mhkrause

122

November 13, 2000 11:10 AM

In 95% of cases, not enough to be noticed.

The remaining 5% are things like tight inner loops where every cycle matters.

Meaning: Use them until profiling tells you the function look-up is killing you.
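If you want a rough number before reaching for a real profiler, a minimal timing sketch along these lines can help. It compares a direct call with a call through a function pointer, which is only an approximation of a C++ virtual call. All names here (step, run_direct, run_indirect, seconds_for) are made up for illustration, and the results will vary a lot with compiler, optimization flags and CPU.

```c
#include <assert.h>
#include <time.h>

/* Cheap work per call; unsigned arithmetic wraps, so no overflow issues. */
static unsigned step(unsigned x) { return x * 1664525u + 1013904223u; }

/* volatile forces the pointer to be reloaded on every call, so the
   compiler cannot quietly turn the indirect call into a direct one. */
static unsigned (*volatile indirect_step)(unsigned) = step;

/* Calls step() directly n times (the compiler may inline this). */
unsigned run_direct(unsigned long n)
{
    unsigned acc = 0;
    unsigned long i;
    for (i = 0; i < n; ++i)
        acc = step(acc);
    return acc;
}

/* Same work, but every call goes through the function pointer. */
unsigned run_indirect(unsigned long n)
{
    unsigned acc = 0;
    unsigned long i;
    for (i = 0; i < n; ++i)
        acc = indirect_step(acc);
    return acc;
}

/* Runs one of the loops and returns the CPU time it took, in seconds. */
double seconds_for(unsigned (*loop)(unsigned long), unsigned long n,
                   unsigned *result)
{
    clock_t start;
    assert(loop != NULL && result != NULL);
    start = clock();
    *result = loop(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call seconds_for(run_direct, n, &r) and seconds_for(run_indirect, n, &r) with the same n and compare. If the difference disappears once the loop body does real work, the indirection is not your bottleneck.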

NuffSaid

122

November 13, 2000 12:59 PM

Might be a little off topic here, but does anyone know how to implement vtables in C? Not C++. Just plain old C.

==========================================
In a team, you either lead, follow or GET OUT OF THE WAY.

JonatanHedborg

122

November 13, 2000 01:57 PM

I have a question. What are vtables?

=======================
Game project(s):
www.fiend.cjb.net


Anonymous

November 13, 2000 02:03 PM

A vtable is just an array of function pointers. It would look vaguely like:

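  typedef struct {
      void **vptr;    // points at the vtable for this object's type
      // other stuff here
  } Object;

  // Object_Func0 and Object_Func1 are assumed to be declared earlier.
  void *Object_vtable[] = {
      &Object_Func0,
      &Object_Func1
  };

  void InitObject(Object *o)
  {
      o->vptr = Object_vtable;
      // Other init stuff
  }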

You'd have one vtable per type of object. When you initialize the object, just set vptr to the correct table. To call a method, index the slot you want and cast it to the appropriate function pointer type, e.g. ((POBJECTFUNC1) object->vptr[1])(...);
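A slightly fuller sketch of the same idea, in case it helps. It uses a struct of typed function pointers instead of a void * array, so no cast is needed at the call site. Every name here (Shape, ShapeVTable, rect_init and so on) is made up for illustration; it is not how any particular compiler lays things out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape Shape;

/* One table per type: a struct of typed function pointers. */
typedef struct {
    int (*area)(const Shape *self);
    int (*sides)(const Shape *self);
} ShapeVTable;

/* Every object carries a pointer to the table for its type. */
struct Shape {
    const ShapeVTable *vptr;
    int w, h;
};

static int rect_area(const Shape *s)  { return s->w * s->h; }
static int rect_sides(const Shape *s) { (void)s; return 4; }
static int tri_area(const Shape *s)   { return s->w * s->h / 2; }
static int tri_sides(const Shape *s)  { (void)s; return 3; }

static const ShapeVTable rect_vtable = { rect_area, rect_sides };
static const ShapeVTable tri_vtable  = { tri_area,  tri_sides  };

/* "Constructors": set the fields and point vptr at the right table. */
void rect_init(Shape *s, int w, int h)
{
    assert(s != NULL);
    s->vptr = &rect_vtable;
    s->w = w;
    s->h = h;
}

void tri_init(Shape *s, int w, int h)
{
    assert(s != NULL);
    s->vptr = &tri_vtable;
    s->w = w;
    s->h = h;
}

/* "Virtual" calls: the caller does not need to know the concrete type. */
int shape_area(const Shape *s)  { return s->vptr->area(s); }
int shape_sides(const Shape *s) { return s->vptr->sides(s); }
```

After rect_init(&r, 3, 4), shape_area(&r) goes through rect_vtable; after tri_init(&t, 3, 4), the same shape_area(&t) call ends up in tri_area instead.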

-Mike

Shannon Barber

1,684

November 13, 2000 09:14 PM

Once upon a time machines were so slow that the performance penalty was significant: 5%-10% (when used efficiently).
Today that's dropped below 1%...

MFC & multi-threading had a 3% performance impact on my code. (450MHz TNT)

- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara

Marsupial Rodentia

122

November 15, 2000 09:29 PM

Though I haven't benchmarked it myself, I suspect that the double-indirected function call (calling a function via a pointer to an interface) has no overhead on Pentium II & III chips; it takes the same number of clock cycles as any other function call (immediate mode addressing, direct mode addressing, or indirect mode addressing).

However, on RISC chips like MIPS and Alpha, I suspect it takes a couple extra clock cycles to accumulate the address of the function you are calling.

LilBudyWizer

491

November 16, 2000 02:33 AM

As I was reading some documentation today, a thought occurred to me as to where this question may have come from. A good deal of documentation refers to a vtable being used to "look up" the address of a virtual function. This is misleading. It isn't looking it up in the traditional sense, in that it isn't searching. It is loading from a specific offset in a table. The example of the C implementation should make that clear. There is no search; you are just referring to a data member in a structure. It does result in an extra load of a register.

The class has a pointer to the vtable rather than the vtable actually being in the class. I believe you are going to have to load the address of the vtable before you can use indirection to jump to the function. A given class only has one vtable and it is the same for every instance. You are trading a little speed for space, but the sacrifice is small.
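To make the "fixed offset, not a search" point concrete, here is a small sketch in plain C. The slot numbers are compile-time constants, so a call is two loads (the table pointer, then the slot) followed by an indirect call. The names (Animal, SLOT_LEGS, call_slot and so on) are hypothetical and only there for illustration.

```c
#include <assert.h>

typedef struct Animal Animal;
typedef int (*Method)(const Animal *self);

/* Slot numbers are fixed at compile time, like offsets into the table. */
enum { SLOT_LEGS = 0, SLOT_NOISE = 1, SLOT_COUNT = 2 };

struct Animal {
    const Method *vptr;   /* one pointer per instance */
    int volume;
};

static int dog_legs(const Animal *a)   { (void)a; return 4; }
static int dog_noise(const Animal *a)  { return a->volume * 2; }
static int bird_legs(const Animal *a)  { (void)a; return 2; }
static int bird_noise(const Animal *a) { return a->volume; }

/* One table per type, shared by every instance of that type. */
static const Method dog_vtable[SLOT_COUNT]  = { dog_legs,  dog_noise  };
static const Method bird_vtable[SLOT_COUNT] = { bird_legs, bird_noise };

Animal make_dog(int volume)
{
    Animal a;
    a.vptr = dog_vtable;
    a.volume = volume;
    return a;
}

Animal make_bird(int volume)
{
    Animal a;
    a.vptr = bird_vtable;
    a.volume = volume;
    return a;
}

/* Dispatch: load vptr, load the slot, call through it. No searching. */
int call_slot(const Animal *a, int slot)
{
    assert(slot >= 0 && slot < SLOT_COUNT);
    return a->vptr[slot](a);
}

/* Two objects of the same type point at the very same table. */
int same_table(const Animal *a, const Animal *b)
{
    return a->vptr == b->vptr;
}
```

Each Animal only grows by the size of one pointer no matter how many slots the table has, which is the space side of the trade-off.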

Keys to success: Ability, ambition and opportunity.

SiCrane

11,840

November 16, 2000 12:52 PM

quote: Original post by Marsupial Rodentia

Though I haven't benchmarked it myself, I suspect that the double-indirected function call (calling a function via a pointer to an interface) has no overhead on Pentium II & III chips; it takes the same number of clock cycles as any other function call (immediate mode addressing, direct mode addressing, or indirect mode addressing).

However, on RISC chips like MIPS and Alpha, I suspect it takes a couple extra clock cycles to accumulate the address of the function you are calling.

You forgot the opportunity costs for register allocation on the register-starved x86, as well as cache misses on the lookup table, and cache pollution from loading the lookup table. Also, I'm pretty sure that even on a PIII, a double-indirected function call still takes at least one more clock cycle, especially seeing as no compiler I've ever asked to generate a virtual function call has ever done so with a single instruction. (As far as I can remember, it always emits a mov followed by a call. So even if both instructions are fetched in the same cycle, a pipeline stall would result, as the result of the mov is forwarded at least one stage after it is needed by the call.)

I don't know about an Alpha, but on a MIPS R2000 chip, it takes at least three extra cycles to call a virtual function. (lw, pipeline stall, lw, jret compared to just a jret)