Replacing the STL vector Class

Published February 19, 2008
Advertisement
I'm concerned that some people will read this post and all they will take away is "the standard vector class sucks". It doesn't. You should not go out and write your own replacement vector class. But to pretend that it's a perfect class in all situations is absurd, too. The purpose of this entry is to document a real life situation where replacing the vector class with a custom implementation was the right thing to do.

Also, don't give me any flak about how "STL" is a bad choice of terminology. I don't care.

Replacing the STL vector Class

At Day 1, we're not the sort of people who go around arbitrarily replacing standard code. We rely heavily on the Standard C++ library, as well as some fragments of boost. Our memory allocation solution is operator new and delete. In general, our code is modern C++, written by people who are familiar with the language as it is used nowadays. There's even plenty of template metaprogramming magic under the hood. The point is, replacing the standard vector class with our own custom implementation wasn't a matter of "not implemented here" syndrome*, but a carefully considered move.

Our technology depends very heavily on vector. It's used thousands of times throughout the codebase, practically any time a list of items is needed. They're used as temporary storage over the course of a frame a lot, as well. The end result is that the behavior of vector has a significant impact on the overall behavior and performance of our games. Thankfully, nowadays the "next gen" consoles, 360 and PS3, both sport fully featured standard libraries licensed from Dinkumware, the same people who provide the implementation for VS on PC. What that means is, for the most part, we benefit from competently written standard containers that behave sanely. For the most part.

Being licensed from the same people, sadly, does not mean the libraries are identical. In particular, the PS3 libraries use a somewhat older version than the 360 or PC. One important difference shows up between the two; on PS3, when you .clear() a vector, it frees its internal memory, and begins growth from the initial size all over again when you start adding things. Unfortunately, this has the effect of adding several hundred allocations per frame to our games in real life, normal load situations. (If memory serves, this same exact problem afflicts VC 7.1.)

There are some other odd quirks of the Dinkumware implementation. For one thing, vector iterator's constructor is apparently complex enough that the compiler is unable to verify that it has no side effects. What this means is that, if you have an iterator local to a function that you don't actually use, the compiler is unable to generate a warning about an unused local, because it's not sure that it does nothing. Along with that, the extra security and safety stuff in their implementation tends to add a lot of extra code size, especially just after compilation. The linker is able to filter out a lot of it (at a cost to link time), but it still affects the final code size. (The defines like _SECURE_SCL can help remove a lot of that, but it's still a concern.)

The most interesting bit of all, though, is that writing our own custom vector allows us to instrument it very heavily and gather all kinds of statistics on how vectors are used and abused throughout the engine and game code. This type of instrumentation simply isn't possible without heavy modification to the standard container, or a complete rewrite. (One could imagine a vector with an optional template policy parameter for instrumentation, I suppose, but the reality is that once you start doing that, there's a real compulsion to throw tons of random policies in.) With custom instrumentation on the vector, we can track almost any statistic we want. That information can in turn be used to tweak the growth pattern of the vector to suit our specific usage and to track places where we need to reexamine how the container is being used. We can also observe what types are used to instantiate the vector, and traits of those types (pointer types vs non pointer types, POD types vs complex types, etc). From that information we can also judge how effectively the compiler is folding identical code, which unfortunately is useful and necessary information thanks to some toolchain issues.

Our replacement, of course, is not that dissimilar from the library one in most regards. It's just been tweaked and simplified, and then instrumented. It's also written by a very skilled senior engineer and uses all the modern C++ tricks, including metaprogramming and type traits. Replacing a library container with a better version is no walk in the park, especially when you have to work without changing the external interface or the semantic behavior in any way. We have huge amounts of code using vectors that cannot break in the transition from STL to our custom container, after all. Given those restrictions, I would have a difficult time creating a replacement that had any significant advantages over the standard version, even though at a technical level I am equipped with the requisite skills to do so.

A little later, we decided that string, the other standard library class that our code relies on heavily ought to be looked at too. Doing an all out replacement was overkill (rewriting string is insanely difficult, especially when you have to integrate with iostreams and other pieces of the standard library), but we did want to use a custom allocator and leave the option to switch open. So like with vector, we switched away from using std::string and to a typedef in our own namespace instead.

I should point out that these were extremely late changes to the code. (I can't reveal precisely how late, of course.) We used the standard classes for several years before finally deciding that we needed custom code. Switching late is a total pain, but as long as you're thorough and careful, it works fine. If we had decided from the beginning that we wanted custom containers, we probably would've needlessly rewritten several others, like std::map and std::list. This isn't a decision that you have to make right from the start, like some people think. You do have to preserve the semantics of the original when you switch over, but the standard semantics are entirely reasonable; it's only the implementation which doesn't provide everything we wanted.

* "Not implemented here" refers to the compulsion by some that all the code they use has to be in house (or occasionally expensive licensed middleware) and not standard code or open source libraries. This is of course totally a totally ludicrous viewpoint, but it shows up a lot anyway.
Previous Entry Assert Sucks
0 likes 1 comments

Comments

rip-off
Interesting post. Pity you can't get in to the specifics, but a fascinating read none the less.
February 19, 2008 07:30 PM
You must log in to join the conversation.
Don't have a GameDev.net account? Sign up!
Advertisement