Advertisement

Implementing an internally linked list via inheritance

Started by May 06, 2017 09:43 PM
17 comments, last by SeanMiddleditch 7 years, 6 months ago

you can totally build an intrusive container/refcounter/etc. using just composition

Fair enough. What I should have said was that an unintrusive container would be less hassle to maintain and more likely to be useful for reuse in the future.

I don't quite get the whole ECS stuff anyway

Easier to give power to scripters and to maintain (you can rewrite a component as a new version and then swap it out with the old version without shutting everything down). IIRC this was popularized by MMOs, where those kinds of things outweigh everything else.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

Unity uses a tree, not a list. Nodes can have more than one child.

The transform hierarchy in Unity effectively creates a scene graph - the each transform is added to its parents. If this is the behavior you're looking for then it would be good to focus more on how to effectively implement this behavior rather than on this specific data structure. (Please specify if this is the case.)

There's no reason to use intrusive linkage. Prefer composition over inheritance.

A scene graph of Transforms is absolutely something I want to get to, but it'll be a little while until that's in scope. I'll probably make a new topic if/when I get stuck on it.

A linked list is the wrong data structure to use here (a linked list is rarely the right data structure to use these days). [...] You're mainly concerned with adding/retrieving components by type id, right?. So an unordered_map seems to be the obvious choice. If you need to support multiple components of the same type on a game object, then use unordered_multimap.
Allow me to add that std::unordered_map (and even more so std::map) is also rarely the right data structure.

A std::unordered_map (empty, of any type) has an overhead of 56 bytes on my platform, which is already not that neglegible for something that's added to every node just like that.

But it gets worse: As easily observed by overloading global operator new, adding one element to an unordered_map typically does two dynamic allocations (for std::unordered_map<int,int> 16 and 24 bytes, respectively, on my platform). Adding two elements does three dynamic allocations (16, 24, 16 bytes). Adding three elements does 5 allocations (16, 24, 16, 16, 56 bytes). Adding 25 kv-pairs does 29 dynamic allocations. Adding 2,500 kv-pairs does 2,510 allocations. It's asymptotic for sure, but for small N it's a bummer of overhead just for managing a few key/value pairs.

On the other hand side, for large N, the fact that std::unordered_map has a cache coherency only rivalled in its badness by std::map, the guaranteed-cache-miss-every-time factor comes into play. So... it's neither really a stellar performer for small N, nor for large N.

I don't quite get the whole ECS stuff anyway (well, I do get it insofar as I see how it's nice and flexible, I just don't get how it can actually ever, in practice, work with an acceptable performance, but let's ignore whether or not I get it... that's irrelevant) but if I was to do such a thing, my first choice, until proven wrong, would certainly be a small_vector (such as clang uses internally), the expectation being that the average node does exactly zero extra allocations and that linearly iterating over 5-8 elements is just as fast as doing one hash table lookup.

But that's just me.

Didn't expect such solid programming insight from the guitarist of Emperor.

Really good points, and yes, there should be <10 components on nearly every GameObject, but it is supposed to be a somewhat robust system so there may be cases of 100+ components in a single container. It is fair to assume that most usages will never exceed 10 components in a single container though.

The main point that stuck out to me about hash mapping was that it uses the runtime types as the lookup, which bypasses type-checking in a linear iteration (maybe there's a better way to handle linear polymorphic iteration that I don't know about). Since components are generally looked up by their type, this seemed pretty efficient.

I guess now I'll have to run some tests on the speed and storage differences between linearly iterating a vector with type-checking vs. a map. If I find anything notable I'll post it here.

Advertisement
I'm not a fan of overgeneralization, so this may not work for what you want to do, but I wouldn't tie "look up component by type" in with runtime type information.

Instead, I'd store multiple vectors, one for each type of component. An empty non-allocated vector is cheap enough, and you can group them logically into a tree of components and component-groups if necessary.

Then, "give me components of type X" becomes an O(1) operation with no reflection or runtime type data needed. You just write a function that returns the appropriate vector.



Like I said, this is not purist ECS and it probably will aggravate folks that are insistent on that purity. But it's food for thought.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

I don't quite get the whole ECS stuff anyway

Easier to give power to scripters and to maintain (you can rewrite a component as a new version and then swap it out with the old version without shutting everything down). IIRC this was popularized by MMOs, where those kinds of things outweigh everything else.
You get those same ability with regular, plain, no framework required, composition.

The typical "ECS framework", which honestly is a complete and utter anti-pattern, was popularised by a programmer froma first person shooter who declared it the future of MMO design on his unfortunately influential blog series.

maybe there's a better way to handle linear polymorphic iteration that I don't know about

The answer is to unask that question and don't do that.
If you want performance, you want data locality and code locality. That means not using polymorphism, or only using it at a higher level, such as a whole system of components instead of a single component (i.e. the S in ECS framework).
To paraphrase Acton, multiple objects is the general case. When was the last time you wrote a compoment that was only instantiated once? One object is a corner case. Write code for the general case, which means writing batch processing functions.

IIRC this was popularized by MMOs, where those kinds of things outweigh everything else.

Nope. It was someone who claimed it would be the future of MMOs. I've worked on MMOs for most of the 10 years since that blog post was written and I've not seen or heard of a single one use a component-based architecture, never mind the extreme component/entity/system separation that blog talks about.

(Oh, just seen that Hodgman mentioned this already. Oh well.)

I'd strongly advise anybody who wants to break their game entities into components to really think about what benefit they're hoping to get from it, and plan that up-front rather than trying to follow any abstract rules they may have read. The existing literature on component-based games systems is mostly evangelism and very little practicality, often with conflicting aims or goals that are not relevant to new developers.

You get those same ability with regular, plain, no framework required, composition.

Isn't ECS just composition ad absurdum? It seems like every time I talk to people about ECS the boundaries of its definition are different. -.- My impression was that ECS is a system where new types of component can be created and injected, and entities are just clumps of various components constructed at runtime by data.

Nope. It was someone who claimed it would be the future of MMOs. I've worked on MMOs for most of the 10 years since that blog post was written and I've not seen or heard of a single one use a component-based architecture, never mind the extreme component/entity/system separation that blog talks about.

Nice. I'll extend my procrastination about messing with it.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.
Advertisement

You get those same ability with regular, plain, no framework required, composition.

Isn't ECS just composition ad absurdum? It seems like every time I talk to people about ECS the boundaries of its definition are different. -.- My impression was that ECS is a system where new types of component can be created and injected, and entities are just clumps of various components constructed at runtime by data.


Sort of. As far as I can tell, at least some ECS is people looking for a silver bullet encountering some data-oriented design principles, misinterpreting them, and turning their misinterpretation into dogma. As you say, though, the definition depends on who you're talking to.

Composition is a feature of almost every language and the core pillar of OO (and ecs, with limits...)

An "Entity/Component/System framework" (abbreviated as ECS) means:
Entities are nothing, but can compose components into themselves.
Components are data structures that are banned from using composition themselves, but they may link to entities.
Systems contain logic that operates on specific components.

Note that this is different to an "Entity/Component framework" (often called an "Entity/Component system" as system/framework are interchangeable)... So if you're building an Entity/Component system, that is a different thing to building an Entity/Component/System system!
These is a much more vague term, but is usually the same thing but with Components and Systems being rolled together, and overall less dogma. There's usually still the same restriction that entities may make use of composition while components are banned.

Often EC-frameworks are made to mesh well with OOP, while ECS-frameworks are closer to relational database systems or procedural programming.

Common features of both kinds of framework include:
* the component members of an entity being identified only by their type, instead of name as in OOP.
* often a restriction that an entity cannot contain two instances of a single component type.
* the ability to find components by type, and to query the parent entity of a component to find sibling components by type.
* a data driven template/prototype system, that allows entity "classes" to be created from data files.

As a fun straw-man exercise... :D
To an OO programmer, adding ECS dogma to your coding style while remaining on OO land would be similar to this:
* Classes must be commented as an Entity type or Component type.
* Classes labelled as Component can only contain primitive types, or pointers to an Entity type class.
* Classes labelled as Entity can only contain pointers to Component type classes.
* Classes labelled as Entity must name their members the same as those member types - and as a consequence, there cannot be two members of the same type.
* No class may contain member functions or any logic.
e.g.


struct Health { float hit_points; }; // component
struct PoisonSpell { Player* target; float time_left; }; // component
class Player // entity
{
  Health* _Health;
  PoisonSpell* _PoisonSpell;
};
void PoisonSystem( float dt, Player* begin, Player* end )//system
{
  for( Player* p = begin; p != end; ++p )
  {
    PoisonSpell* s = p->_PoisonSpell;
    if( !s )
      continue;
    s->time_left -= dt;
    if( s->time_left <= 0 || !s->target )
    {
      p->_PoisonSpell = 0;
      delete s;
    }
    Health* h = s->target->_Health;
    if( h )
      h->hit_points -= dt;
  }
}

Sure that's a valid style that you can choose to program in, and by default in C++ it's not at all dynamic (my entity definition above is static code)... so if you like this style you would either have to build a big ECS framework to actually allow dynamic entity descriptions, or just use another language like Lua where you could write in this style natively without the need to build a framework at all! Personally, I'm a fan of doing the latter -- write code that's idiomatic to he language and forgo large frameworks of any kind.

I have read that there are some obscure custom compilers for things like DSPs and microcontrollers that don't support it

That isn't the real problem with them. The main problem is that they just don't work the way most people think they do, and it's fairly trivial to break them in larger build systems.

And part of _that_ problem is that pragma once is not well-defined. Namely, how does the compiler decide that a header has already been included? Is it the file's inode? Relative path name? Absolute path name (including symlinks? multiple mount points? etc.) ? What about a file that is accessed through two separate paths that go through two separate virtual filesystems (like an NFS mount to the localhost and the direct local path? yes, there are systems setup like this, and can be relevant to games on Linux-based CI systems). This also comes up on some weirder systems (not relevant to games) that don't have directory structures similar to the POSIX/Win32 model.

Basically, it's impossible to make pragma once work reliably the way people expect. There simple isn't a way to uniquely identify a file across all possible file systems, platforms, and weird things some platforms can do with reparse or mount points or network filesystems and so on.

The compiler differences do come up. For instance, porting code between GCC and MSVC when using precompiled headers on both can be a little tricky specifically because of the different rules each have surrounding #pragma once. Those differing rules also make it impossible to standardize, since if it were then one or the other would be required to change behavior and break existing code.

There are active official proposals to include #pragma once or a variant #once in the C++ standard. If they are accepted they would like use the #pragma once version because of the existing universal support.

#pragma once will never be standardized since it is inherently broken as described above.

The #once proposal is basically just a syntactic shortcut to include guards: the #once syntax requires an additional token (just like an include guard) to identify the file which is intended to be unique (though it could be used to ensure that only one of a particular set of files is included by sharing the identifier between them). "#once foo" essentially just wraps the file contents in "#ifndef FOO #define FOO #endif".

And even that's super unlikely, because Modules will kind of make traditional headers obsolete anyway.

... all that said, I just use #pragma once myself. :p

Sean Middleditch – Game Systems Engineer – Join my team!

This topic is closed to new replies.

Advertisement