
Alternative to flexible array members for avoiding multiple allocations

Started by dp304, May 06, 2018 03:52 AM
5 comments, last by Hodgman 6 years, 6 months ago

Hi! I reckon that to handle, for example, a texture asset in a game, the following would be a more or less elegant C++ approach (ignoring that everything is public here):


struct Texture
{
    int width, height, pixel_size;
    std::vector<char> pix;
};

Then I could create an instance "tex", get the image width, height and bytes-per-pixel value from the image file, set the appropriate fields, then call tex.pix.resize() accordingly and fill the vector with the pixel data. Anything that processes the pixels (OpenGL, for example) could then use &tex.pix[0] as a base pointer.

However, after I allocate the Texture dynamically, the pixel array still has to be allocated separately by std::vector. I would like to avoid this (some reasons are below); instead, I would like the array to be part of the struct itself, which is hard because my array does not have a fixed size. There is something in C99 called "flexible array members", which allows the following syntax:


struct Texture
{
    int width, height, pixel_size;
    char pix[];
};

but only if the unsized array is the last member of the struct. This is in C99 only, not in the C++ standard, although gcc apparently supports it in C++ as an extension. Another trick is to use an array of one element, like this:


struct Texture
{
    int width, height, pixel_size;
    char pix[1];
};

In this case, one could use it like this:


    // obtain image width, height and bytes per pixel,
    // and store them in local variables w, h, n

    char * ptr = new char[sizeof(Texture) - 1 + w*h*n];  // pix array will have w*h*n elements
    Texture * tex = reinterpret_cast<Texture*>(ptr);
    tex->width = w;
    tex->height = h;
    tex->pixel_size = n;

    // fill tex->pix[] with pixel data
    // use the texture

    delete [] ptr;

My first question: is it safe? Some say that it leads to undefined behaviour when a fixed-size (here: size 1) array is indexed out of bounds, because higher optimisation levels could do something unwanted. But I've seen an excellent open-source C++ game use this technique at one point (allocating with calloc() instead of new char[]), so it should work (most of the time?).

Some reasons I desire something like this are:

1. "Data locality" i.e. I would like the metadata be close to the actual data in memory, so that it is not necessary to follow another pointer to a potentially far away location.

2. I would like to keep the asset data structures (such as this Texture example) as plain old C data (without non-trivial destructors, virtual functions, etc.), to avoid the overhead of another allocation/deallocation, or the risk of leaks caused by missed destructor calls. I wouldn't really take advantage of the dynamically expanding std::vector anyway: the size of the asset doesn't change after loading. A simple character array is almost good enough for me: if the asset is a long text, I don't need anything else, but for textures I also need width, height and bytes per pixel, and for PCM data I need sample size, number of channels, etc.

In short, an asset in most part consists of a large byte sequence, but in many cases I would like to smuggle some metadata in front of it.

How should I do this? Is there anything more suited than structs with flexible array members? Or should I follow a completely different approach?

 

I'm confused, I think... What is wrong with the following data structure to hold texture data?


struct Texture
{
	unsigned int width, height, bits_per_pixel;
	char *data;
};

// Or:

struct Texture
{
	~Texture() { delete [] data; }
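	// (If you add an owning raw pointer and a destructor like this, remember
	// the rule of three: also define or delete copy/move, or a copied Texture
	// would delete the same data twice.)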

	unsigned int width, height, bits_per_pixel;
	char *data;
};

It's not like this is going to be a chunk of code that turns up in any profiler unless you do insane things with it. The only way you'll "miss" the destructor call is if you fail to delete the Texture struct itself. If you want to go full-bore C++, which is perfectly fine, you can use your first example struct. Use smart pointers to manage the texture's lifetime (unique_ptr, and use .get() to pass around a raw pointer; check for nullptr and you're off to the races).
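A minimal sketch of that smart-pointer suggestion, reusing the first Texture struct from the original post (upload() is just a hypothetical consumer that takes a raw pointer):

#include <cstddef>
#include <memory>
#include <vector>

struct Texture
{
    int width = 0, height = 0, pixel_size = 0;
    std::vector<char> pix;
};

void upload(const Texture* tex)
{
    if (!tex) return;                  // nullptr check, as suggested above
    // ... hand tex->pix.data() to the graphics API here ...
}

int main()
{
    auto tex = std::make_unique<Texture>();
    tex->width = 256;
    tex->height = 256;
    tex->pixel_size = 4;
    tex->pix.resize(std::size_t(tex->width) * tex->height * tex->pixel_size);

    upload(tex.get());                 // pass a non-owning raw pointer
}                                      // unique_ptr deletes the Texture (and its vector) here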

"Those who would give up essential liberty to purchase a little temporary safety deserve neither liberty nor safety." --Benjamin Franklin


I do understand where you are coming from. I don't know whether there is any built-in support for this kind of 'dual allocation', but you could write one yourself easily enough (for example, a factory that creates/deletes these for you, casts them to the basic struct, and internally handles them as byte arrays).
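Something like the following, as a rough sketch of that factory idea (names and layout are just for illustration, and it relies on the same struct-hack pattern being discussed above, with the same caveats about strict C++ object-lifetime rules):

#include <cstddef>
#include <cstdlib>

struct Texture
{
    int width, height, pixel_size;
    char pix[1];                        // struct-hack placeholder, as above
};

Texture* create_texture(int w, int h, int n)
{
    // One allocation for the header plus the pixel bytes; offsetof avoids
    // relying on sizeof(Texture) - 1, which includes trailing padding.
    const std::size_t bytes = offsetof(Texture, pix) + std::size_t(w) * h * n;
    Texture* tex = static_cast<Texture*>(std::malloc(bytes));
    if (!tex) return nullptr;
    tex->width = w;
    tex->height = h;
    tex->pixel_size = n;
    return tex;
}

void destroy_texture(Texture* tex)
{
    std::free(tex);                     // must match the malloc above
}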

As to whether, as CrazyCdn suggests, it would make any significant difference in practice in terms of performance: probably not in the scenarios you mention (textures, audio samples), but possibly in some cases (lots of short, variable-length string objects, maybe?).

If you are interested in data locality, alignment, simplified allocation and faster file access, an alternative you may also want to look at is packing resources together into a 'pre-baked' format of your own design: you load the resource file as one allocation and use it in place in memory, 'fixing up' pointers within it (http://tomhulton.blogspot.co.uk/2011/12/load-in-place-data-structures-and.html).
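A very rough sketch of that idea (the format and field names here are hypothetical): the baked file stores byte offsets instead of pointers, and the loader resolves them against the blob's base address after a single read.

#include <cstdint>
#include <cstring>
#include <vector>

struct BakedTexture
{
    std::uint32_t width, height, pixel_size;
    std::uint32_t pix_offset;           // byte offset of the pixels within the blob
};

// Resolve the offset once the whole resource file is in memory.
inline const char* texture_pixels(const std::vector<char>& blob,
                                  const BakedTexture& tex)
{
    return blob.data() + tex.pix_offset;
}

// Usage, assuming the BakedTexture header sits at the start of the blob:
//   BakedTexture hdr;
//   std::memcpy(&hdr, blob.data(), sizeof hdr);   // copy out the header
//   const char* pix = texture_pixels(blob, hdr);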

It's perfectly fine to allocate enough memory for both and cast the type, but why would you want to do that? If it's for performance reasons, it won't be noticeable, because a pointer dereference is nothing compared to the cost of binding a new texture to the GPU. If you have lots of tiny textures, the best thing you can do is put them all in a single sprite sheet and then, in your code, organize your draw calls so that every one that uses this sprite sheet is batched contiguously.

You may be right; I was probably once again too efficiency-conscious too early in development. Thanks for the feedback from all of you!

My idea for loading/managing asset data in memory was basically what @lawnjelly suggested in the first paragraph. Some loader functions would act as factories for the appropriate asset type (doing the allocation and filling it with data). My "asset manager/locator" thing would maintain the different kinds of assets in one place, so these would be handled internally as char (smart) pointers anyway. In the case of plain old C structs, the assets could be deleted via these char pointers without any problem.

But since you all recommended against this forced single allocation I had had in mind, in the end I will probably do it the C++ way instead (with a contained std::vector or something similar in the asset, like in the first Texture example). Of course, then the above method won't work: when deleting the char pointer, the destructor of the contained std::vector won't be called; this is what I meant when I wrote about the risk of missed destructor calls. What I plan to do is make a base class with nothing but an empty virtual destructor, and all assets will be subclasses of it. Instead of char pointers, I will maintain pointers to the base class.

When the smart pointers (in my case, shared_ptrs all pointing to the same asset) are destroyed, delete is automatically called -> thanks to the virtual destructor, the destructor of the specific asset (e.g. Texture) is called -> the destructor of the contained std::vector is also called.
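Something along these lines is what I have in mind (just a sketch; the names are placeholders):

#include <memory>
#include <vector>

struct Asset
{
    virtual ~Asset() = default;         // common virtual destructor for all asset types
};

struct Texture : Asset
{
    int width = 0, height = 0, pixel_size = 0;
    std::vector<char> pix;              // freed by ~Texture, reached via ~Asset
};

int main()
{
    std::shared_ptr<Asset> asset = std::make_shared<Texture>();
    // ... the asset manager hands out copies of this shared_ptr ...
}   // last shared_ptr gone -> ~Texture runs -> the vector's memory is released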

20 hours ago, dp304 said:

1. "Data locality" i.e. I would like the metadata be close to the actual data in memory, so that it is not necessary to follow another pointer to a potentially far away location.

In this case, maybe you've got something like:


for( int y = 0; y != height; ++y )
  for( int x = 0; x != width; ++x )
    texel[y*width + x] += 42;

In the general case, a texture might be megabytes in size, but your cache is only kilobytes... You will see a very small improvement, because when loading width/height you will also load the first 14 or so texels from the array (a typical 64-byte cache line, minus a few bytes of metadata, holds roughly a dozen 4-byte texels)... So maybe a 0.001% improvement for a 1k x 1k texture? Not really an issue that should keep you up at night :)

20 hours ago, dp304 said:

or the risk of leaks caused by missed destructor calls

The only way to miss destructor calls is if you already have leaks in the first place...

20 hours ago, dp304 said:

for textures I also need width, height and bytes per pixel, and for PCM data I need sample size, number of channels, etc.

Usually, you only need the texels in RAM for a very short amount of time -- you hand them over to GL, and then you deallocate them. However, you probably want to keep the metadata around for a long time. Putting both of them into a single object with a single lifetime means that you can't do this.
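For illustration (a sketch assuming plain OpenGL, RGBA8 texels, an existing GL context and a desktop GL header): the pixel bytes can live in a temporary buffer that disappears right after the upload, while only the small metadata struct sticks around.

#include <vector>
#include <GL/gl.h>

struct Texture
{
    int    width = 0, height = 0;
    GLuint id = 0;                      // handle to the GPU copy of the texels
};

Texture upload_texture(const std::vector<char>& pixels, int w, int h)
{
    Texture tex;
    tex.width  = w;
    tex.height = h;
    glGenTextures(1, &tex.id);
    glBindTexture(GL_TEXTURE_2D, tex.id);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, w, h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());
    // (filtering/mipmap state omitted for brevity)
    return tex;
}   // the caller is free to discard 'pixels' as soon as this returns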

