
Scott Bilas Generic Handle-Based Resource Manager

Started by January 24, 2021 12:30 PM
14 comments, last by Shaarigan 3 years, 11 months ago

Hi,

I have been working through the article ‘Gem: A Generic Handle-Based Resource Manager’ by Scott Bilas trying to understand and implement the resource handle system in C++. There are some points in the article that I'm having trouble understanding:

  1. What does Acquire do (provide access to the actual handle contents?) and how is it different from dereference?
  2. What is the purpose of the NameIndex map and why is this useful in GetTexture?
  3. What does GetTexture do exactly? In particular the !tex->load if-statement block.
  4. Is the HandleMgr the cache and the TextureMgr the store, in resource manager terminology?

Thanks.

Not read the entire article, but from the code:

Acquire pulls an entry from the free list, and (re-)initializes to blank, so looks like the equivalent of “malloc” to me, making a new handle. Dereference returns the base pointer to the data from a valid handle, it seems.

“typedef std::map <std::string, HTexture, istring_less > NameIndex;” NameIndex maps a string (name of the texture I think) to a texture handle.
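To make the name lookup concrete, here is a minimal sketch of such a map. The comparator below is only an assumed equivalent of the article's istring_less (its definition is not shown in this thread), HTexture is reduced to a plain integer, and find_handle is a hypothetical helper.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the article's HTexture: just an integer id here.
using HTexture = unsigned int;

// Case-insensitive "less than" for strings; an assumed equivalent of the
// article's istring_less, not the original definition.
struct istring_less {
    bool operator()(const std::string& a, const std::string& b) const {
        const std::size_t n = a.size() < b.size() ? a.size() : b.size();
        for (std::size_t i = 0; i < n; ++i) {
            const int ca = std::tolower(static_cast<unsigned char>(a[i]));
            const int cb = std::tolower(static_cast<unsigned char>(b[i]));
            if (ca != cb) return ca < cb;
        }
        return a.size() < b.size();  // on a common prefix, shorter sorts first
    }
};

using NameIndex = std::map<std::string, HTexture, istring_less>;

// Returns the handle stored under `name`, or 0 (treated as null) if absent.
HTexture find_handle(const NameIndex& index, const std::string& name) {
    const auto it = index.find(name);
    return it == index.end() ? 0u : it->second;
}
```

With this comparator, "Grass.png" and "grass.PNG" count as the same key, so a second insert under a differently cased name does not create a second entry.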

tex->load(..) calls the ‘load’ method of tex, which likely tries to load the texture and returns a boolean indicating the result. The “!” inverts the returned result, so the “if” block likely runs when loading fails.

HandleMgr is a class template, i.e. a generic blueprint for handle managers. TextureMgr is an example manager for textures: “typedef HandleMgr <Texture, HTexture> HTextureMgr;” says HTextureMgr is the “real” manager, and TextureMgr adds some wrapper functions to make using it simpler. It also implements loading, I see:

bool TextureMgr::Texture :: Load( const std::string& name )
{
    m_Name = name;
    // ... [ load texture from file system, return false on failure ]
    return ( true /* or false on error */ );
}


@Alberth Thanks for the reply. I've written up my thoughts on what each of the problematic functions does. What do you think?

Acquire()

If the free list is empty

- index stores the last magic number on the magic numbers vector m_MagicNumbers.

- sets the passed in Handle index to index.

- default constructs a DATA object and adds it to the end of the DATA objects vector m_UserData.

- adds the magic number for the handle (incremented in init()) to m_MagicNumbers.

If the free list is not empty

- store the most recent free handle index in index.

- sets the passed in Handle index to index.

- remove the free slot from the free slot vector.

- update the m_MagicNumbers vector at index position with the new magic number.

Return the memory address for the DATA item created in m_UserData.

[Block A: If the free list is empty.]

[Block B: If the free list is not empty.]

Block A seems to give an existing handle a valid index and magic number and default DATA object and then by returning a DATA* for that handle allow the user to set its stored DATA. Block B updates the index and magic number without default initialising the DATA object.

Dereference()

- check if the index and magic number for the handle are 0, if so return null

- store the handle index

- check the index of the passed in handle doesn't index outside the bounds of the DATA vector.

- check the magic number for that index matches the one stored in the passed in handle.

- return the memory address of the DATA item for this handle

Dereference seems to be the function to call when you want to retrieve the data in a handle (i.e. read). Acquire should be called when you create a handle (but it doesn't have a valid index/magic number yet) and the objective is to assign data to it (write).

GetTexture()

- try to insert a name, handle pair into the name, handle pair map.

- if the insert succeeded, Acquire is called on the handle of the newly inserted element. The DATA template parameter in the Texture example is deduced to have the type Texture, therefore Acquire here returns a Texture*.

- load is called with the name passed into the GetTexture function as its argument.

- if loading fails call DeleteTexture on the handle we inserted into NameIndex.

- default construct a new handle in place of the handle currently in NameIndex.

- return the handle inserted into NameIndex.

NameIndex stores name -> handle mappings, allowing a user to retrieve a handle based on the name of a Texture. A user would then use the returned handle in a call to Acquire, Dereference or Release.

A load() method in a resource manager would therefore use GetTexture() to retrieve a Texture for calling code.

MasterReDWinD said:
- index stores the last magic number on the magic numbers vector m_MagicNumbers.

No, m_MagicNumbers.size() is the length of the "m_MagicNumbers" vector.

Since vector indices start at 0, the length is also the index of the next element that gets appended in the push_back method (C++ speak for appending an element) 3 lines down.

In other words, "index" is the index of the new element that gets created in the storage vectors. That index is stored in the handle as well. New data is created at 'index' (since it didn't exist yet) and the created magic number is also stored at that index in a different vector.

MasterReDWinD said:
Block A seems to give an existing handle a valid index and magic number and default DATA object and then by returning a DATA* for that handle allow the user to set its stored DATA.

The caller makes an empty handle (probably simply making a variable "Handle myhandle;" is sufficient), and passes it into the Acquire function with “HANDLE& handle”. This means that writes into handle within Acquire actually end up in the myhandle variable of the caller. The Acquire function fills the handle with data (index where the data is stored + a magic number to make sure 2 handles with the same index number can be distinguished).

MasterReDWinD said:
Block B updates the index and magic number without default initialising the DATA object.

Block B has the same exit invariants as above, except it re-uses an already existing entry.

In Block A, it creates a new entry with "push_back(DATA())". In B, the entry already exists, so there is no need to create it. (Just like malloc doesn't give guarantees about the contents of memory, it only makes sure some area is allocated so it won't be used for other purposes. The caller has the responsibility to initialize the acquired space.)

It's like a new house. You get space, but it's literally completely blank, concrete walls and floors. No carpet, no furniture, no decorations. "malloc" gives you the empty house, you have to decorate (initialize) it to your liking.
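The two branches can be sketched roughly as follows. This is a simplified reading, not the article's code: the handle uses two plain fields instead of packed bit fields, the magic number comes from a per-manager counter, and slot_count is a hypothetical helper added for inspection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified handle: two plain fields instead of the article's bit fields.
struct Handle {
    unsigned index = 0;
    unsigned magic = 0;  // 0 means "null handle"
    bool is_null() const { return magic == 0; }
};

template <typename Data>
class HandleMgr {
public:
    // Fills `handle` and returns a pointer to the slot it now refers to.
    Data* Acquire(Handle& handle) {
        unsigned index;
        if (free_slots_.empty()) {
            // Block A: no reusable slot, so grow both vectors in step.
            // size() is the index the new element will get.
            index = static_cast<unsigned>(magic_numbers_.size());
            user_data_.push_back(Data());
            magic_numbers_.push_back(0);
        } else {
            // Block B: reuse the most recently released slot. The old
            // Data value is left as-is; the caller must re-initialize it.
            index = free_slots_.back();
            free_slots_.pop_back();
        }
        handle.index = index;
        handle.magic = next_magic();
        magic_numbers_[index] = handle.magic;
        return &user_data_[index];
    }

    // Marks the slot as free; every copy of the handle becomes stale.
    void Release(const Handle& handle) {
        assert(handle.index < user_data_.size());
        assert(magic_numbers_[handle.index] == handle.magic);
        magic_numbers_[handle.index] = 0;
        free_slots_.push_back(handle.index);
    }

    std::size_t slot_count() const { return user_data_.size(); }

private:
    unsigned next_magic() {
        if (++magic_counter_ == 0) ++magic_counter_;  // never hand out 0
        return magic_counter_;
    }

    std::vector<Data> user_data_;
    std::vector<unsigned> magic_numbers_;
    std::vector<unsigned> free_slots_;
    unsigned magic_counter_ = 0;
};
```

Note that the returned pointer is only safe to use until the next Acquire, because push_back may reallocate the vector. That is one reason to hold on to the handle rather than the pointer.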

MasterReDWinD said:
- check if the index and magic number for the handle are 0, if so return null

The null-handle means "unused handle":

bool IsNull(void) const  { return ( !m_Handle ); }

If m_Handle is 0, !0 is 'true', so the handle "Is Null" (i.e. "not used" or "not valid").

MasterReDWinD said:
Dereference seems to be the function to call when you want to retrieve the data in a handle (i.e. read).

Not exactly, you get a pointer to the memory, so you can do both read and write.

The checks before it are sanity checks (don't try to access a non-existing handle index, and check that the magic number of the handle matches with the number that was set during Acquire).

The magic number is for catching the following scenario:

  1. acquire a handle (this will be stored at some index i)
  2. release that handle but keep it around (makes index i available again as last free entry)
  3. acquire a second handle (gets index i as well since it's the last free index)
  4. try to dereference using the first handle.

4 fails, because the magic number of step 1 doesn't match the magic number of step 3.
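The four steps can be replayed against a stripped-down manager. This sketch is int-only and returns nullptr for a stale handle (whether the original also asserts in that case is not covered here); the class name IntHandleMgr and the per-manager counter are assumptions made for the example.

```cpp
#include <cassert>
#include <vector>

struct Handle {
    unsigned index = 0;
    unsigned magic = 0;  // 0 marks the null handle
};

// Minimal int-only manager, just enough to show the stale-handle check.
class IntHandleMgr {
public:
    int* Acquire(Handle& h) {
        if (free_.empty()) {
            h.index = static_cast<unsigned>(data_.size());
            data_.push_back(0);
            magic_.push_back(0);
        } else {
            h.index = free_.back();  // last released slot is reused first
            free_.pop_back();
        }
        h.magic = ++counter_;        // wrap handling omitted for brevity
        magic_[h.index] = h.magic;
        return &data_[h.index];
    }

    void Release(const Handle& h) {
        assert(Dereference(h) != nullptr);  // only live handles may release
        magic_[h.index] = 0;
        free_.push_back(h.index);
    }

    // Returns nullptr for null, out-of-range, or stale handles.
    int* Dereference(const Handle& h) {
        if (h.magic == 0) return nullptr;
        if (h.index >= data_.size()) return nullptr;
        if (magic_[h.index] != h.magic) return nullptr;
        return &data_[h.index];
    }

private:
    std::vector<int> data_;
    std::vector<unsigned> magic_;
    std::vector<unsigned> free_;
    unsigned counter_ = 0;
};
```

Acquiring `first`, releasing it, and then acquiring `second` gives both handles the same index but different magic numbers, so Dereference(first) is rejected while Dereference(second) succeeds.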

MasterReDWinD said:
Acquire should be called when you create a handle (but it doesn't have a valid index/magic number yet) and the objective is to assign data to it (write).

Acquire gives you empty space (a fully empty house). It may be an existing (empty) house, or a new one. Caller doesn't care if it's new or not. You get entrance to the house in both cases (a pointer to the data memory), and you can decorate the house as you like (or not, although it's likely you want to decorate, not much fun living in a fully empty house). As a side-effect, you also get a handle (an address of the house).

Dereference takes the address, checks that you are the last person that acquired the house (i.e. you are the rightful owner, as you haven't released it back to the pool of empty houses yet), and then it gives you entrance again so you can read what you stored before, and/or write new things.

MasterReDWinD said:
- if the insert succeeded, Acquire is called on the handle of the newly inserted element. The DATA template parameter in the Texture example is deduced to have the type Texture, therefore Acquire here returns a Texture*.

Success on insertion in a map means the key (the name) didn't exist (it could be inserted as new entry), ie you queried a texture with that name for the first time. So you need a new storage place (a new empty house) for the new texture.

The template parameter deduction is correct.

MasterReDWinD said:
- load is called with the name passed into the GetTexture function as its argument.

Yes, so apparently, name is also the filename of the texture.

MasterReDWinD said:
- if loading fails call DeleteTexture on the handle we inserted into NameIndex.

The "m_Textures.Acquire( rc.first->second );" gave us an empty house with entrance rights. Since we failed to find the texture, we have no need for that space, so we return it to the pool for other users.

MasterReDWinD said:
- return the handle inserted into NameIndex.

The "return" path is used both by new entries and by already existing entries. It gives the handle connected to the texture, so other code can get entrance, and (probably) use the texture.

For a new entry that failed loading, the null-handle is associated with the new entry and returned, meaning "there is no data for this name". (And there literally isn't, as the empty house was returned to the handle manager.)

This also means that if you try to query the texture with the name again, you simply again get the null-handle, it doesn't try to load the texture from the file system again.
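The insert-then-load flow, including the cached null handle, might look like this in a reduced form. Several things are assumptions for the sake of a short example: the handle manager is replaced by a plain vector, HTexture is an integer where 0 is null, Load is a stand-in that fails for names starting with "missing", and load_attempts is a hypothetical counter added to make the caching visible.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using HTexture = unsigned;  // 0 is the null handle (simplification)

struct Texture {
    std::string name;
    // Stand-in loader: pretends names starting with "missing" fail to load.
    bool Load(const std::string& n) {
        name = n;
        return n.rfind("missing", 0) != 0;
    }
};

class TextureMgr {
public:
    HTexture GetTexture(const std::string& name) {
        // insert() reports through .second whether the key was new.
        auto rc = index_.insert(std::make_pair(name, HTexture{0}));
        if (rc.second) {
            ++load_attempts_;
            Texture tex;
            if (tex.Load(name)) {
                textures_.push_back(tex);
                rc.first->second = static_cast<HTexture>(textures_.size());
            }
            // On failure the entry keeps the null handle, so the same
            // name is not loaded again on later calls.
        }
        return rc.first->second;
    }

    int load_attempts() const { return load_attempts_; }

private:
    std::map<std::string, HTexture> index_;
    std::vector<Texture> textures_;  // handle h refers to textures_[h - 1]
    int load_attempts_ = 0;
};
```

Asking twice for a texture that fails to load returns the null handle both times but only touches the loader once, which matches the behaviour described above.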

MasterReDWinD said:
NameIndex stores name -> handle mappings, allowing a user to retrieve a handle based on the name of a Texture. A user would then use the returned handle in a call to Acquire, Dereference or Release.

Acquire is silly. You ask the texture manager for a decorated house (a handle that points to the loaded texture), and then you ask for a new empty house. It's simpler to create a new handle yourself and access the handle manager directly.

Release is probably a bad idea. The handle manager manages a handle globally, it has no knowledge how many copies of handles exist. If one person says "I don't need this space anymore", the handle manager invalidates the space and all handles to that space (that house) become invalid. Obviously, the latter includes the handle in the name->handle map.

@Alberth Many thanks for taking the time to address each of those points, it has really helped. There is a lot of subtlety with this system and I'm still working through it. I have a basic (as yet untested) implementation now, with some attempted additions: a linked list instead of the free vector, and also reference counting on the handles by implementing an equivalent of the 'releaseTexture' function rather than ‘deleteTexture’.

The article mentions wrapping the magic number. Is there a benefit to this other than keeping the magic number from getting too large?

I'm trying to test this handle system with my model parsing class. At the moment I'm relying on dependency injection to pass the ‘TextureMgr’ (Texture_store in my system) to the Model class so that it can access the ‘HandleMgr’ (Resource_cache<Texture>, Resource_handle<Texture_h>) to load textures. I also have a separate Texture_loader. This will mean passing a lot of different store objects to a lot of different systems once I incorporate Mesh_store, Material_store, Shader_store, etc., which does have me worried. However, this approach seems to be much preferred over a singleton-style object. Does this sound like a viable way of using the handle system?

This logic is based on trying to follow the store, loader, cache, proxy single responsibility system and having a separate resource manager for each type of resource. Concepts that come up in many other resource manager threads on here.

The cache is the data backing that holds the resources currently loaded in memory and dumps them under various rules. The cache owns the lifetime. I see the HandleMgr as the cache and the TextureMgr as the store. The store is the external interface, providing access to the resource. The store is the factory, not the resource loader.

MasterReDWinD said:
a linked list instead of the free vector

Linked list elements tend to get scattered across memory, which may play havoc with the CPU cache if you do a lot of free/malloc, but it depends heavily on your creation and release patterns. Otherwise, it seems fine to me.

MasterReDWinD said:
also reference counting on the handles

I guess it depends on your ability to release all handles throughout your code. On the other hand, it does catch the case where everybody just drops the handle, and nobody tells the storage about it. The original probably assumed different parts did not coordinate release. It may also have been left out due to limits on the length of the article, or to keep the explanation clear.

MasterReDWinD said:
The article mentions wrapping the magic number

Eventually it will of course hit MAX_INT. In those days that may have been a 32 bit number, which for the article is long enough. (It's not core, so the author doesn't want to spend lines of text on it.)

If you switch to 64 bit, it becomes a non-issue at the cost of a longer magic number. Given that you have a limited number of different stores in the program, I would probably spend a few bits of the number on the type of store, so it's impossible to use a magic value at any store other than the one where it was created.

Wrapping is of course an option; it depends on how bad it is if you get the wrong texture (for example) from the store at some point. If you do implement reference counting, you can check if a given magic number is in use (at least in theory), and skip it.
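As a small illustration of both ideas (wrapping, and spending a few bits on the store type), here is a sketch with an invented layout: 16 magic bits, the top 4 for a store id and the low 12 for a wrapping counter. The bit widths and all function names are assumptions, not taken from the article.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for illustration: 16 magic bits, of which the top 4
// identify the store and the low 12 form a wrapping counter.
constexpr std::uint32_t kCounterBits = 12;
constexpr std::uint32_t kCounterMask = (1u << kCounterBits) - 1;  // 0x0FFF

// Advances the counter, wrapping within 12 bits and skipping 0 so that a
// counter of 0 is never handed out.
inline std::uint32_t next_counter(std::uint32_t counter) {
    counter = (counter + 1) & kCounterMask;
    return counter == 0 ? 1 : counter;
}

// Combines a store id (0..15) with the counter into one magic number.
inline std::uint32_t make_magic(std::uint32_t store_id, std::uint32_t counter) {
    assert(store_id < 16 && counter != 0 && counter <= kCounterMask);
    return (store_id << kCounterBits) | counter;
}

// Extracts the store id again, so a store can reject foreign handles.
inline std::uint32_t store_of(std::uint32_t magic) {
    return magic >> kCounterBits;
}
```

With only 12 counter bits a slot that is recycled about four thousand times can repeat a magic number, which is the trade-off mentioned above: more bits for the store type means earlier wrapping.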

MasterReDWinD said:
‘TextureMgr’ (Texture_store in my system) … so that it can access the ‘HandleMgr’

Sounds a bit complicated, but I have a hard time picturing what you're doing. I would probably make the handle manager a base class, and derive the various stores from it. The base class deals with handles and memory, the derived class configures the base class on initialization, and provides a loading service to the base class in the form of a virtual function, and converts types of data between the base class and ‘outside’, for as far as that is even needed.

Practically, the base class would do 90% of the work, and the derived class mostly has a constructor to setup the base class, and an override on some generic load virtual function call from the base class. To a user from the outside, the derived class does everything that is needed (getting handles, getting the data associated with the handle, and releasing the handles). In reality all that is done by the base class inside.
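A rough sketch of that split, under a few assumptions: handles are plain integers with 0 as null, release is left out, and the names StoreBase, TextureStore, Get and Load are made up for this example. The Texture struct and its placeholder gl_id are also hypothetical; no OpenGL call is made.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Handle = unsigned;  // 0 is the null handle (simplification)

// Base class: owns storage and the name -> handle index.
template <typename Data>
class StoreBase {
public:
    virtual ~StoreBase() = default;

    Handle Get(const std::string& name) {
        auto it = index_.find(name);
        if (it != index_.end()) return it->second;
        Data item{};
        Handle h = 0;
        if (Load(name, item)) {  // the derived class does the loading
            items_.push_back(item);
            h = static_cast<Handle>(items_.size());
        }
        index_[name] = h;        // failed loads are remembered as null
        return h;
    }

    const Data* Dereference(Handle h) const {
        return (h == 0 || h > items_.size()) ? nullptr : &items_[h - 1];
    }

protected:
    virtual bool Load(const std::string& name, Data& out) = 0;

private:
    std::map<std::string, Handle> index_;
    std::vector<Data> items_;
};

struct Texture {
    std::string file;
    int gl_id = 0;
};

// Derived class: only supplies the loader.
class TextureStore : public StoreBase<Texture> {
protected:
    bool Load(const std::string& name, Texture& out) override {
        if (name.empty()) return false;  // stand-in for a real file load
        out.file = name;
        out.gl_id = 1;                   // placeholder value only
        return true;
    }
};
```

Outside code would hold a TextureStore and call Get and Dereference on it, yet both of those are implemented entirely in the base class.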

MasterReDWinD said:
I also have a separate Texture_loader.

One option is to make a wrapper class, that packages all the various objects into a single cohesive class. Inside, you can have all the various objects separately.

I agree with a separate store for each different type of thing. Unlike you I tend to avoid splitting up functionality into many classes, but that's likely a matter of preference. (To me, conceptually different parts doesn't automatically imply physically different classes.)

MasterReDWinD said:
The store is the external interface, providing access to the resource. The store is the factory, not the resource loader.

It's a matter of perception I think. While technically my derived class above wraps the base class, the derived class doesn't even implement the external handle interface, the derived class only implements an internal load function for the base class.

The external user thus may seem to talk to the derived class, but it never actually accesses code from the derived class directly. All external user calls arrive directly at the base class. Does that make the derived class a factory? I think you can argue either way.

But this is all very vaguely defined territory, with many good solutions. Implement it so it feels right to you.


Alberth said:
Sounds a bit complicated, but I have a hard time picturing what you're doing. I would probably make the handle manager a base class, and derive the various stores from it. The base class deals with handles and memory, the derived class configures the base class on initialization, and provides a loading service to the base class in the form of a virtual function, and converts types of data between the base class and ‘outside’, for as far as that is even needed. Practically, the base class would do 90% of the work, and the derived class mostly has a constructor to setup the base class, and an override on some generic load virtual function call from the base class. To a user from the outside, the derived class does everything that is needed (getting handles, getting the data associated with the handle, and releasing the handles). In reality all that is done by the base class inside.

@alberth Thanks again for your help.

I see, so in your case HandleMgr would be the base class and TextureMgr/store would derive from it. The derived class/store would pass through all the Get* calls to the base class and only interact with the loader directly, via the virtual load() function?

While implementing this I've noticed that I have to choose where I want to get the actual data out of the handle. Do you know of a best practice for this? Currently my Model, Mesh and Material classes all store texture handles, and I'm assuming that the actual Texture data (which is really an OpenGL int reference) should only be retrieved at the last step, somewhere in the Renderer?

@Alberth I have the implementation running now, with a single texture for the time being. However, I'm having some trouble figuring out the reference counting. I initially built the reference count into the handle but I've realised that this is incorrect. The reference count needs to be on the objects in the HandleMgr UserData vector. The problem is that the handle has no way to communicate with the HandleMgr to tell it to increase the reference count of an object in the UserData vector. I don't want to make the HandleMgr a global object. Passing the HandleMgr into the constructor of each reference is also not really an option.

You touched on the reference counting aspect in your earlier reply. The article only says “Add automatic reference counting as standard functionality, rather than leaving it to be the responsibility of the owner of HandleMgr.” It makes it sound like there is an obvious way to add it. Do you have any idea what the author might have had in mind?

I haven't been asked, but: ref-counting only works if all handles have access to the same memory. So either you let your manager create the handles and have each handle carry a pointer to the manager, or you keep buckets of memory (so entries never change location) that hold a resource info struct, and do the ref-counting in those structs.

I'd prefer the second one because it has fewer indirections, and the handle could also provide some useful info to functions that work on or with the data.
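One possible reading of the second option, sketched under assumptions: the handle stores a pointer to a per-resource record and adjusts the count in its constructor, copy operations and destructor, and the records live in a std::deque so their addresses stay put as more are added. ResourceInfo, TextureHandle, TextureCache and their members are invented names, and the sketch never frees a record when its count reaches zero.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Per-resource record kept by the cache. Fields are assumptions.
struct ResourceInfo {
    std::string name;
    int gl_id;
    int ref_count;
};

// Handle that points at its record and counts itself in and out.
class TextureHandle {
public:
    TextureHandle() = default;
    explicit TextureHandle(ResourceInfo* info) : info_(info) { retain(); }
    TextureHandle(const TextureHandle& other) : info_(other.info_) { retain(); }
    TextureHandle& operator=(const TextureHandle& other) {
        if (this != &other) {
            release();
            info_ = other.info_;
            retain();
        }
        return *this;
    }
    ~TextureHandle() { release(); }

    const ResourceInfo* get() const { return info_; }

private:
    void retain() { if (info_) ++info_->ref_count; }
    void release() {
        if (info_) {
            assert(info_->ref_count > 0);
            --info_->ref_count;
            info_ = nullptr;
        }
    }
    ResourceInfo* info_ = nullptr;
};

class TextureCache {
public:
    // std::deque::push_back does not move existing elements, so the
    // ResourceInfo addresses already handed to handles stay valid.
    TextureHandle Add(const std::string& name, int gl_id) {
        records_.push_back(ResourceInfo{name, gl_id, 0});
        return TextureHandle(&records_.back());
    }
    int RefCount(std::size_t i) const { return records_[i].ref_count; }

private:
    std::deque<ResourceInfo> records_;
};
```

The handle never needs to know about the cache itself, only about its record. The catch is that handles must not outlive the cache, and a real version would still need a policy for what happens when a count drops to zero.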

@Shaarigan Thanks for adding your input all the same.

I didn't consider having the manager create the handles but I did consider every class that needs handles having a reference to the manager so that it could access a pointer to a Texture stored in dynamic memory in UserData. Passing the manager into everything doesn't seem tidy though so I agree option one is not optimal.

I'm missing something in trying to understand option two. The manager stores structs in UserData, rather than Textures directly. Then those structs store a pointer to a dynamically allocated texture, and a pointer to a dynamically allocated int reference count? How does the handle increase the reference count in the struct though?

This topic is closed to new replies.
