Adventures in data-oriented design – Part 3c: External References

In the last installment of this series, we talked about handles/internal references in the Molecule Engine, and discussed their advantages over raw pointers and plain indices.

In a nutshell, handles are able to detect double-deletes, accesses to freed data, and cannot be accidentally freed – please read the previous blog post for all the details.

This time, we will introduce IDs or external references, which also allow us to move individual items around in memory without anybody noticing. But why would we want to do that?

The answer is quite simple: Molecule’s world/unit-management uses an entity-component-system, where each entity is defined by the components it consists of. Components are small, orthogonal pieces of data, and individual systems are responsible for handling the corresponding components. All of the components are simple PODs, and do not contain any logic or code at all; all code resides in invidivual systems.

As an example, a SkinnedMeshComponent stores various handles that refer to the mesh data, to a double-buffered vertex-buffer that contains the skinned data, and to shaders and a material. It does not contain any code, and can be mem-copied at will.

The SkinnedMeshSystem is responsible for handling the components: it owns all the components, creates and destroys them, and updates and renders them.

So why do we need to shuffle individual items in memory?

  • Creating and destroying components should be an O(1) operation. If not, performance could severely deteriorate after playing for several hours, or when creating lots of components.
  • Iterating over all components in order to update them should cause as few cache misses as possible. Therefore all items should be packed densely and contiguous in memory, like an array.

Clearly, those two requirements are at odds with each other. We can create and destroy components in O(1) by using a free-list, which solves our first problem. But then we cannot iterate over all components in a cache-friendly manner.

On the other hand, using a simple array fulfills our second requirement, but creating and destroying components becomes an O(N) operation if we don’t use any auxiliary data structures.

A custom data structure

As with so many progamming problems, an extra indirection will do the job.

Instead of using either a free-list or an array, we will in fact use both: a sparse array managed by a free-list, and a dense array that holds the individual components.

The sparse (outer) array stores indices into the dense (inner) array, and a generation for each item. This generation is our monotonically increasing integer that was introduced last time. The dense array stores the items contiguously in memory, just like a regular array.

Components in such an array are not referenced by handles, but by so-called IDs. An ID stores an index into the sparse array, along with the generation at the time of creation.

Let’s detail the performance characteristics of operations that need to be performed on such a data structure:

  • Creating a component consists of adding a new component at the end of the dense array (O(1)), and allocating a new slot in the sparse array using an in-place free-list (O(1)). The slot stores an index into the inner array, and the user holds on to an ID that is a „smart index“ into the outer array (O(1)).
  • Destroying a component means that the slot depicted by the ID needs to be returned to the free-list (O(1)), and that the component needs to be removed from the inner array. This can be done in O(1) by simply swapping the removed component with the last in the array, properly updating the corresponding index from the outer to the inner array as well.
  • Accessing a component causes an additional cache-miss because of the extra indirection: we first need to access the sparse array in order to fetch the index, and retrieve the component from the inner array using that index.
  • Iterating over all components is a simple for-loop over the dense array, causing the least number of cache-misses possible.

As can be seen, creation and destruction is O(1), and iterating over all components is like iterating through an ordinary array. The only drawback is the extra cache-miss for each external access to a component. Arguably, accesses from outside the individual component-systems should be far fewer than internal data accesses each frame, therefore such a data structure should be a win in almost all situations.
Additionally, if you’re careful, look-ups for certain components can be done once a frame, and the pointer to the underlying component can be cached for one frame. I would not recommend doing this, but it can be beneficial if you really need to access a thousand individual components several times each frame.

The Bitsquid Engine uses a similar data structure called the packed array, but my implementation differs from Niklas’.

Conclusion

In Molecule, I use IDs for referencing all kinds of components. The user only ever sees the ID, and nothing else. Hence, IDs share all the benefits with handles, and have some additional ones as well, especially in conjunction with an entity-component system: you do not need a base class for your components, because an entity can simply store an array of IDs/integers. This creates even less coupling between components and entities.

This was the last installment in the mini-series about ownership and data references, hope you enjoyed it!

Advertisements

78 thoughts on “Adventures in data-oriented design – Part 3c: External References

  1. Hi, just a question :
    when you say ” because an entity can simply store an array of IDs/integers” .
    In entities, don’t you need to store along each component ID an “system ID” to know what system the component ID refers to ?

    • Oh, yes, there is other information that is stored as well. In addition to the component IDs, an entity also stores the component types in a separate array. The type is a simple uint16_t (I assume there won’t be more than 65536 distinct types).

      • Hi Stefan, there’s one thing I don’t understand:

        When you attach a component to an entity from code, you can lookup the responsible world and system via templates based on the TypeID — which is straight-forward, and at compile time. However, at run-time, how do you destroy an entity together with all its components, without some kind of lookup-table that maps TypeID-s to systems?

        I mean, an entity can have N different components, and when I simply say “destroyEntity()”, I’d like to have all its attached components destroyed as well within each System (well, delayed to the end of the frame, but that doesn’t really matter). The only thing I can imagine that would handle this is a huge switch statement, since I can’t pass run-time variables (the uint16_t component types within the array) as template arguments.

        May I ask how you handle this?

      • When you attach a component to an entity from code, you can lookup the responsible world and system via templates based on the TypeID — which is straight-forward, and at compile time.

        Exactly. This is done by using my own traits, which is a base template along with template specializations for the world and system classes.

        However, at run-time, how do you destroy an entity together with all its components, without some kind of lookup-table that maps TypeID-s to systems?

        Short answer: by using a static table of function pointers.

        Long answer: I have a compile-time construct that assigns a unique integer to each type, e.g. ComponentIdToInt<MeshComponentId>::Value would yield 1, the value for PointLightComponentId would yield 2, and so on. So whenever a component gets added to an entity, I store this value as an integer along with the component. Additionally, I have a static array of so-called destroy functions, which take a type by template argument, and a world along with a component ID:


        template <typename IdType>
        static void DestroyComponent(World* world, uint32_t componentID)
        {
        const IdType typedId = union_cast<IdType>(componentID);

        // overload resolution will automatically select the correct function to call for this type of ID
        world->DestroyComponent(typedId);
        }

        The function casts the integer back to the original ID type, and then calls the World’s DestroyComponent method. This automatically calls the correct system, using the same compile-time traits mechanism that is used for adding a component. All the destroy functions for all the component ID types are stored in a static array, so removing or destroying a component just fetches the corresponding destroy function from the array (O(1) lookup), and calls the function (function call via function pointer).

      • Thank you, the array of function pointers explains it all! I assume in this case, you start the ComponentIdToInt counter from zero (although you said “The value of the integer does not matter” earlier).

        Sorry, I don’t mean to ask too many questions, but out of curiosity: Do you defer the destruction of entities and components – do it after all systems have finished with their work?

      • I assume in this case, you start the ComponentIdToInt counter from zero (although you said “The value of the integer does not matter” earlier).

        Yes, the counter starts from zero.

        Do you defer the destruction of entities and components – do it after all systems have finished with their work?

        Yes, because it makes certain tasks easier, even though it needs a bit more memory.

  2. Great post once again! 🙂 You mentioned that your implementation is different than Niklas’, do you mind share insights about this? Of course Niklas’ post about packed arrays doesn’t have generation / counter implementation, but is there something else that is different. And since the devil is in the details, how do you find the slot for the last object that needs to be fixed to point right location in remove scenario? In Niklas’ post there is undefined Object struct that is holding this index value which is of course quite inconvenient.

    • Thanks Jani!

      For finding the slot of the last item in the dense array, I use an additional m_denseToSparse array. As the name implies, it is used to lookup the sparse slot, given a dense index. Niklas’ implementation also has that, but it’s stored inside his Object struct. That’s one difference between Niklas’ and my implementation.
      The other one is that I use a free-list that fills holes as soon as possible, whereas his implementation uses a FIFO free-list.

  3. Hi Stefan, I’ve a couple of questions:
    1) What kind of memory layout you chose for storing components data inside the systems?AoS, SoA, mixed?
    2) How are the components accessed by outside the systems? Is there a wrapper object holding an ID to the actual component (which dispatches calls to the corresponding system), or you preferred a more system-centered approach (like the handles one)?

    • 1) Generally speaking, most data in the engine is laid out in SoA fashion, because 95% of the time it is better in terms of cache-friendliness. But that is decided on a case-by-case basis. Data that is accessed together is stored together. In the case of components, it is sometimes mixed, depending on the components and what the system has to do with them. As an example, rendering-related data of a component is stored in an AoS layout, but AABBs are stored separately. Culling only needs the latter, rendering needs all of the former.

      2) You access components by using the ID. There is no additional wrapping going on. Using the ID, you lookup the corresponding components and then do whatever you want to do with them :). Maybe a code example makes things clearer: assume we created an entity with a DirectionalLightComponent. Upon creating the entity, we get an ID that uniquely identifies that entity, e.g.:

      const game::EntityId entityId = world.CreateEntity();

      Using that ID, we can lookup the actual entity, and then retrieve components from it:

      game::Entity* entity = world.LookupEntity(entityId);
      graphics::DirectionalLightComponent* component = entity->GetComponent<graphics::DirectionalLightComponent>();

      Hope that clears things up!

      • Hi, thanks for the answers, I’ve got a last one doubt regarding the 2):
        in such a design, how do you prevent the users from keeping a raw pointer to the actual entity/component instance in order to not end up with a dangling pointer somewhere somehow? A simple “never save that pointer for later use” notice should do the job? The question arises because the temptation to keep the raw pointer is rather high from the point of view of the user, so, why not just eradicate that possibility by using a C-style API approach? E.g.:

        const game::EntityId entityId = world.CreateEntity();
        world.set_entity_position(entityId, Vec3(0, 0, 0));
        world.set_whatever(entityId, …);

        Thank you very much!

      • I do not (and cannot) prevent the user from storing a raw pointer with this design, that’s true. And the approach you described is perfectly valid: never return the pointer to the user, use IDs everywhere.

        However, I like to give the users full control over everything, and clearly state what the limitations are. If you get a pointer to an entity or component, the documentation for those functions (and also the high-level docs) clearly state that all those pointers are only valid for one frame. So if somebody needs to do a lot with the same component during a frame, and all the lookups start to create noticeable overhead, it is valid to get a pointer once, and cache it for the rest of the frame.

        In the scripting language I would only offer a C-style API, like you described. There the lookups aren’t the dominating factor anyway.

        This gives you full control in C++, and full safety in scripts, which is a nice trade-off I think.

      • Hi Stefan, you mentioned that you store an array of component Ids and an corresponding array of component types with each entity. How do you map the type T from “entity->GetComponent()” to that array of component types? Is there some sort of custom RTTI involved?

        Btw, great series!

      • No custom RTTI is involved, nothing too sophisticated.

        All we need is an integer that uniquely identifies a type. The value of the integer does not matter, and it also does not need to be consistent between different compiled versions. I use a nifty template that gives me an unique integer for every type I feed into it – in the same executable it will always yield the same integer for a type, and all distinct types will yield different integers.
        Storing a component at an entity then becomes:

        m_components[m_componentCount] = union_cast(componentID);
        m_componentTypes[m_componentCount] = TypeToId<T>::ID;

        TypeToId is the class template I use for that purpose, TypeToId<T>::ID yields a unique integer for every type T.

        When retrieving a component T you can simply search the array of component types until you find TypeToId<T>::ID, and return the corresponding component, which is cast into T.
        And that’s about it.

        I might write a post about the entity-component-system in the future, because I find it rather elegant how it deals with generic components without a common base class, and only stores simple integers internally.

      • Just curious. If your components are all stored in highly efficient IDArrays contained within the systems that operate on them, how do you retrieve a component pointer from your entity class?

        Is there some kind of mapping to what system holds each component or are you providing each entity with an updated pointer each frame?

      • Entities only store IDs to their components, so if you have an entity you can retrieve certain component IDs from it, e.g.

        BloomComponentId bloomId = entity->GetComponentId();

        Using that component ID, you can get a pointer to the component by asking the world for it, e.g.

        BloomComponent::Soa bloomComponent = world->LookupComponent(bloomId);

        Looking up a (pointer to a) component in the corresponding ID array held by the world/system involves an extra indirection, but that is negligible. You can generally store individual component IDs for as long as you want, they will automatically realize if the corresponding component has been discarded in the meantime. You can also hold on to pointers to components during a frame, but need to grab the pointer at least once a frame because the elements in the ID array may have changed their order.

        The mapping between what system holds which components (and which component IDs) is done via type-traits at compile-time. The mapping between worlds and systems is also done at compile-time, so it all really only boils down to asking the ID array for the component of a certain ID, which involves one indirection compared to a raw pointer access.

      • Hi Stefan,

        while reading some of your replies and comparing your strategies with those in mine engine, I’have found some techniques very interesting but not totally clear to me… reading this code example:


        game::Entity* entity = world.LookupEntity(entityId);
        graphics::DirectionalLightComponent* component = entity->GetComponent();

        It seems that you have an Entity class instance stored somewhere that offers function utilities like GetComponent, so I imagine that in your engine an entity is not only a mere ID but your “EntityManager” has an array of Entity instances that can be map via ID and those instances have arrays capable to store “bounded” components IDs, so you can return a pointer to the desired instance… Am I right?

        Proceeding in that direction and focusing on the second line, you are able to return also a pointer to a component type, now, as you state (and like in my case), most of the systems store its component data in a SoA manner so obviously you cannot return a pointer to that, that means (correct me if I misunderstood) that your systems also store an array of ComponentType instances ready to be return as pointer to the user like the Entity case above that offers some function utilities.

        I’ve found also very interesting this sentence:


        The mapping between what system holds which components (and which component IDs) is done via type-traits at compile-time. The mapping between worlds and systems is also done at compile-time



        that is really, really, really interesting… In my engine I’m using hash maps (where the keys are integer representing a type similar to yours TypeToId::ID) to map all these things, but I’m not very happy with it and so I’m really curios about how you have achieved a compile-time mapping.

        Last but not least, in your code examples you use very often the union_cast like here in the comments (or in other posts):


        m_components[m_componentCount] = union_cast(componentID);

        I know that, in theory, a union cast is very dangerous and a lot of programmers say horrible things against type punning using unions and prefer to use reinterpret_cast, so I’m just curios on why are you using that cast and what is the reason behind your choices? just my pleasure of learning :)


        – DC

      • It seems that you have an Entity class instance stored somewhere that offers function utilities like GetComponent, so I imagine that in your engine an entity is not only a mere ID but your “EntityManager” has an array of Entity instances that can be map via ID and those instances have arrays capable to store “bounded” components IDs, so you can return a pointer to the desired instance… Am I right?

        Yes, the EntityWorld retrieves a game::Entity* using a given ID. An Entity stores individual components by storing their ID and type in an array.

        Proceeding in that direction and focusing on the second line, you are able to return also a pointer to a component type, now, as you state (and like in my case), most of the systems store its component data in a SoA manner so obviously you cannot return a pointer to that, that means (correct me if I misunderstood) that your systems also store an array of ComponentType instances ready to be return as pointer to the user like the Entity case above that offers some function utilities.

        No, the systems do not store that. The pointer returned here is due to this blog post being written before all the SoA systems were in place. Currently, all systems store their data in SoA manner, so what is really returned here is of type graphics::DirectionalLightComponent::Soa. See this post for more details.

        The mapping between what system holds which components (and which component IDs) is done via type-traits at compile-time. The mapping between worlds and systems is also done at compile-time



        that is really, really, really interesting… In my engine I’m using hash maps (where the keys are integer representing a type similar to yours TypeToId::ID) to map all these things, but I’m not very happy with it and so I’m really curios about how you have achieved a compile-time mapping.

        Well, the mapping is a bit more complicated, and definitely grants its own post :). The mapping is done using several helper templates which are specialized for each distinct component type:

        // helper template that yields the corresponding component type to any ID type at compile-time via an inner typedef named Type.
        // specialized in other modules (graphics, game, ...)
        template <typename T>
        struct ComponentIdToComponent {};

        // helper template that yields the corresponding component ID type to any component type at compile-time via an inner typedef named Type.
        // specialized in other modules (graphics, game, ...)
        template <typename T>
        struct ComponentToComponentId {};

        // helper struct that defines the ID type for a component's init struct in an inner Type typedef
        template <typename T>
        struct ComponentInitToComponentId {};

        // helper struct that defines the corresponding system for each component ID in an inner Type typedef
        template <typename T>
        struct ComponentIdToSystem {};

        // helper struct that defines the corresponding world type for each component ID in an inner Type typedef
        template <typename T>
        struct ComponentIdToWorld {};

        Using those templates, I can e.g. deduce which System in which World holds the components for a given ID type. As an example, if we want to lookup a component with a type of DepthOfFieldComponentId, the code knows that the component type we’re looking for is DepthOfFieldComponent, which lives in the PostProcessingSystem inside the RenderWorld.

        The internal call itself is just the following:


        typedef typename componentTraits::ComponentIdToWorld<IdType>::Type WorldType;
        return GetComponentWorld<WorldType>().LookupComponent(id);

        GetComponentWorld() is a specialized template function that returns the instance of RenderWorld. Now that we have the world, RenderWorld::LookupComponent does something quite similar:


        typedef typename componentTraits::ComponentIdToSystem<IdType>::Type SystemType;
        return GetComponentSystem<SystemType>().LookupComponent(id);

        Again, GetComponentSystem() is a specialized template function that returns the instance of the system responsible for storing this kind of component, in this case the PostProcessingSystem. PostProcessingSystem::LookupComponent() finally returns the component we were looking for.

        Note that all these outer calls are simple one-liners, just returning a member. This means that this chain of lookups actually turns into something like m_renderWorld->m_postProcessingSystem->LookupComponent() when being compiled. And all this does is make a lookup into an ID array, meaning it’s very fast (a direct array lookup with one extra indirection).

      • Answering this one here separately because the previous response was already quite long:

        I know that, in theory, a union cast is very dangerous and a lot of programmers say horrible things against type punning using unions and prefer to use reinterpret_cast, so I’m just curios on why are you using that cast and what is the reason behind your choices? just my pleasure of learning :)


        It’s funny how this question comes up time and time again :). The thing is: type punning is bad, breaking the strict aliasing rule is bad, but sometimes there aren’t really any other options. Let us assume you find yourself in a situation where you absolutely have to use type punning. What are your options? You have several:

        1. Use a reinterpret_cast.
        2. Use a union. I have defined my own union_cast for that purpose.
        3. Use a char* for accessing individual bytes.
        4. Use memcpy/std::memcpy.

        Let’s say you have a float, but really want to interpret its bits as an int. Using a reinterpret_cast, you would take a pointer to the float, reinterpret_cast it to int*, and dereference that. In code:


        float someFloat = ...;
        unsigned int intBits = *(reinterpret_cast(&someFloat));

        However, this breaks the strict aliasing rule. E.g. when writing code like this, GCC emits a warning “dereferencing a type-punned pointer breaks the strict aliasing rule”, and rightfully so. In an optimized build, the code will not do what you want it to do, because you didn’t play by the rules.

        If you were to do the same using a union, it would look like this:

        union
        {
        float as_float;
        unsigned int as_uint;
        };

        as_float = ...;
        unsigned int intBits = as_uint;

        This achieves exactly the same as the code above, but works on all compilers. Note that it is not strictly allowed by the standard, because you are only allowed to read the field of a union that has last been written to. In other words, we would only be allowed to read from as_float, but not from as_uint. In practice this does not really matter though, because this is supported by all compilers, and used in a lot of places in code all over the world. If any compiler would not support this, many, many applications would break.

        Nevertheless, you do have standards-compliant options, and those are either using memcpy, or accessing everything using a char*. The reason for that is that a char* is allowed to alias everything, and hence reading from a char* does not break the strict aliasing rule.

        Why I don’t use memcpy? Because at least one compiler I work(ed) with is unable to emit a simple load instruction in even the simplest cases such as this, but rather emits a call to memcpy for copying 4 bytes. I’d rather use a union then.

      • In the template part, you have lost me especially with ComponentInitToComponentId 😦

      • First of all, I fixed the template brackets which WordPress ate again. Sorry about that.
        Second, the ComponentInitToComponentId is something I use for creating/initialising components. You don’t need it to understand how looking up components works.

  4. Btw. how do you call this custom data structure? Is it template, or simple class to inherit by all systems or is it more like a container? By myself I have followed Niklas’ foot steps more or less, and called it as a packed array (template) with an interface with contains: void_t Reset(), ObjectID Create(), void_t Destroy(ObjectID id), Object* Get(ObjectID id), bool_t Get(ObjectID id, Object* ptr), uint16_t GetCount() const, bool_t Contains(ObjectID id) + ctor with capacity to make it as a fixed.

  5. Hello, Stefan !
    (Sorry for the stupid question, I’ve just wanted to clarify a bit.)
    Are your internal arrays of components (which are stored inside their corresponding systems) resizable ?
    Or do you set the limit on the amount of components of certain types depending on the game?
    (IIRC, Niklas uses non-growable, fixed-size arrays.)

    • The arrays of components are non-growable, but their size can be set by the user. It mostly depends on the entities stored in the resource package, and of course how many components are spawned at runtime.

  6. This is a little bit off-topic, but since you are talking about ECS here and you mentioned that your components are PODs without any base class etc. How do you expose component properties to editors? “Usually” this is done using some base class for component which provides interface to register property values etc. I would imagine that you could do this in component system level and same time you would define nice C-style interface that could exposed to scripting side. Am I right?

    • Almost, it’s done a bit different in Molecule.
      The entity-component architecture uses external descriptions for all its components, stored in so-called “Schema” files. Schemas and their properties/fields are treated completely generically, and hence are used for exposing properties to the editor. Additionally, schemas have a few more other advantages over serialization- or reflection-based approaches, which is what I wanted to talk about in a future post.

  7. Yes, this kind of metadata based approach makes things somewhat easier to maintain and extend ECS, e.g. you can describe new components completely with data. This is really convenient if you also happen to use scripting language for logic – you don’t have to write C/C++ code at all for new game logic 🙂 Of course it is still important to be able to define components / component systems in native language due to possible optimization needs.

    Also if you have separated engine from editors / other tools and communication is done via TCP you will gain some benefits from this metadata approach. For example editors can directly parse metadata to get information about exposed properties without any engine queries (easy to build UI for custom components). Besides this entity “compiling” into binary blobs becomes easier compared to more traditional approaches etc.

  8. Quicky here 🙂

    Since your Freelist implementation override memory when you return a previously allocated memory (reusing memory it for intrusive linking) do you store the monotonically increasing number in a separate array inside your IdArray and at retail builds #undef it?

    This means IdArray has 4 arrays, the handles, the generations, the denseToSparse translation and the objects itself right?

    • This means IdArray has 4 arrays, the handles, the generations, the denseToSparse translation and the objects itself right?

      Absolutely right!
      In retail builds, no checks on the generations are done, hence the array holding the current generations doesn’t exist.

  9. In your ECS, How do you handle Systems which use multiple components? Like the render system for example, it might need both a ‘RenderableComponent’ and a ‘PositionComponent’. Can it somehow get the entity id from the renderable component, and use that to ask for the position component?

    • Can it somehow get the entity id from the renderable component, and use that to ask for the position component?

      No, that would be much too cache-unfriendly.
      Most of the time, data that is needed together is stored with the component that needs it in a SoA-fashion. For example, all MeshComponents have a 4×4 transformation matrix, and so do SkinnedMeshComponents. The renderer/system responsible for rendering can access that read-only data from the systems which store the components’ data.

      When data spans multiple systems, one of the systems simply prepares all the data for another system, and returns it from its specific Update()-method. Because I don’t use generic (virtual) Update()/Render() methods, I can pass along any parameters I want, and be very explicit about the ordering of things, even cross-system dependencies. So for example, for animation data somebody has to prepare the matrix palettes for skinning first, and then skin the individual components using that data. Additionally, root motion needs to be extracted and applied to the transformation matrices. As you can imagine, all that data is handled by different systems.

      In such a case, my code looks like the following:
      const AnimationSystem::UpdateData animationUpdateData = m_animationSystem.Update(deltaTime);
      m_skinnedMeshRenderSystem.Update(deltaTime, animationUpdateData.m_matrixPalettes);
      const math::matrix4x4_t* transformations = m_skinnedMeshRenderSystem.ApplyTranslationDeltas(animationUpdateData.m_rootMotionDeltas);
      m_animationSystem.PostUpdate(transformations);

      So every system really only has one responsibility, and if the resulting data is needed by other systems, I simply pass along a contiguous block of data that holds exactly that data – nothing less, nothing more.

      • Hey =}

        How do you handle mappings of entries from different systems? I’m assuming you have a table that keeps track of entry X in the animation system that corresponds to the entry Y in the rendering system?

        Maybe the code you provided above was simplified (no ids) etc? 🙂

        Great answers!

      • Nope, the code is verbatim engine code.
        There is no table or other mapping, because systems are only concerned with one kind of component, and know how to provide the data for other systems (if they even have to). E.g. a physics system would prepare the returned data in a way that the render system can just walk through the memory block contiguously, updating transformations on the corresponding mesh components.

        Mappings are all done at a higher level via IDs, an entity is woven together by different components having unique IDs.

      • Hey again 🙂

        Thanks for the reply! However I think I’m missing a detail 😦 What I observe is that NOT all entities need to have physics components in the engine right? Which means you will have more entries certain systems than others such as in the rendering system. How would the physics system know how the rendering system expects the data since we now have more data in one than the other which means we can’t have direct 1 to 1 mapping (index to index)?

        Besides that, each system might want to sort its data in its most optimal layout for processing? Again this gives different ordering and breaks the 1 to 1 mapping.

        Just an observation :>

        Thanks!

      • Yes, that’s true.
        This mapping is resolved by making use of the IDs stored in the higher-level systems/abstractions, such as the component-IDs stored along with the entities. For components that are required to have counterparts (such as physics components always requiring a rendering component), you can just store the rendering component ID along with each physics component. This means you have another indirection either when writing component data (e.g. outputting data from the physics system), or when reading data (e.g. applying transformations calculated by the physics system to the corresponding rendering components). I don’t think you can have linear accesses everywhere, you need to make a trade-off somewhere.

      • Hey again 🙂

        Yea exactly, that’s what I meant with the table thing mapping one system to the other. But in your case you don’t have an explicit table but your entity holds the components IDs so when it’s time to update entity X we can easily map system entries to other entries. However there is also the case where you might have different sub entries. Entries would be component to component and sub entries in this case would mean transforms to bones. When I mean sub entries I’m talking about 100 transforms but 50 of them are bones. So your renderer expects 100 transforms but 50 of them will come from the animation system.

        I assume there is something like this going on somewhere?
        for each bone_to_xform[bone_id] xform_id where the entity is holding this mapping I assume. This mapping can of course be generated during the content compilation process in the pipeline 🙂

        Am I right?

        Thanks a lot!

      • Sorry for the late reply, Yasin.

        Entries would be component to component and sub entries in this case would mean transforms to bones. When I mean sub entries I’m talking about 100 transforms but 50 of them are bones. So your renderer expects 100 transforms but 50 of them will come from the animation system.

        Hmm, no, I can’t think of any situation where I have something similar to what you call sub-entries in the engine. For animations, it’s the animation system’s job to update the joint transformations according to the skeletal hierarchy, which is taken by the skinning system to generate the skinned vertices. The renderer is only interested in mesh transformations, and if you were to use a GPU-skinning approach, the data would already be prepared appropriately so that you have a separate data stream that contains the matrix palette and joint transformations you need for skinning.

        I don’t quite see why e.g. the renderer would be interested in such things. That should be the concern of systems dealing with the data, which the renderer shouldn’t know anything about (other than the resulting output).

    • Hey Stefan 🙂

      I think there was a slight missunderstanding, might be due to my poor explanation.

      What I meant was: While the entity takes care of the mapping from one component type to the other component type from different systems (Such as the one case you explained with PhysicsComponent to MeshComponent at a higher level), I was thinking there might be the same thing needed for the individual items (entries) within the components like transformation matrices?

      For example the animation system might not have the entire hierarchy of transformation because not all matrices are animated and we would want to keep the data as small as possible, while the renderer expects the entire hieararchy of matrices. So the data can’t be simply handed over from animation system to the rendering system without some extra mapping going on at a higher level?

      Think of it as each system has different index buffers for iterating through the data in their own preferred way. The renderer might want to loop over each individual mesh in this hierarchy and submit them with a specific ordering while the animation loops over the data in completely different ordering (perhaps based on time samples or some other rules). This kind of approach would make the data in the systems preferred ordering and the next system wouldn’t know how to take that data straight away and apply it on the correct parts (such as meshes).

      If the data is not looped over in different ordering and if they are not sized differently then I guess simply handing it over to the next system in the chain is a valid approach because the hierarchy would be mapped 1 to 1. 🙂

      This kind of mapping could be done in the pipeline and the entity would hold it I assume?

      Cheers 🙂

      -Yasin

      • Ah, that’s what you meant.
        To be honest, until now I haven’t run into such a situation. In the case of animations, meshes can either be animated/skinned, or not. If you want to have parts of a model skinned, but not other parts, you have to divide the model into 2 distinct meshes (which in my experience is what artists do anyway).

        The renderer makes a clear distinction between non-animated meshes and animated meshes. And so do the underlying systems, hence the mapping is always a 1:1 mapping in that case:
        -) The renderer renders all non-animated meshes, each mesh having its own transform.
        -) The animation system only touches N animated meshes, updating their transforms.
        -) The renderer takes those N animated meshes and their transforms, and renders them.

        Is that what you meant?

      • Hey Stefan 🙂

        Yea your answer answered my question so I understand perfectly now 🙂

        Thank you!

        Do you internally sort the transforms (in the animation system) for looping over them in different ordering, based on time samples instead of hierarchical chain for example? This was just an example, but the point is; having different sorting “might” be more efficient for that specific system when it is time to run the mega update instead of the default skeletal hierarchy right? Obviously then the animation system would need to keep track of mapping it back so that outside systems can take it back straight away so the animation system would store a translation table? 🙂

        Another example might be that the physics system loops over each “part” not in the skeletal hierarchy order but in maybe in the spatial data structure order? This would mean that we have to store the data as efficient and tightly packed for that ordering which again would need to be translated and mapper back before handing it over to the rest of the system pipeline.

        Do you do this kind of stuff or plan to do? Just wondering 🙂

        This is what my index buffer analogy was about 🙂

        Another question I have is, what is a mesh in your engine? Is a mesh a single part in a bigger group most people call a “model”? If no does this mean you don’t support “sub-parts” that have not been “combined” in Maya / Max which would mean all the subparts are collapsed into one single “mesh” in your file?

        Thank you so much for your awesome blog, me and many others are learning a lot 🙂

        -Yasin

      • Do you internally sort the transforms (in the animation system) for looping over them in different ordering, based on time samples instead of hierarchical chain for example? This was just an example, but the point is; having different sorting “might” be more efficient for that specific system when it is time to run the mega update instead of the default skeletal hierarchy right? Obviously then the animation system would need to keep track of mapping it back so that outside systems can take it back straight away so the animation system would store a translation table? 🙂

        Another example might be that the physics system loops over each “part” not in the skeletal hierarchy order but in maybe in the spatial data structure order? This would mean that we have to store the data as efficient and tightly packed for that ordering which again would need to be translated and mapper back before handing it over to the rest of the system pipeline.

        Do you do this kind of stuff or plan to do? Just wondering 🙂

        This is what my index buffer analogy was about 🙂

        I know what you mean, but no, I don’t do that anywhere in code yet. Hierarchical data is flattened so that linearly accessing it is either depth-first or breadth-first (whatever you need), that is also described in one of my blog posts.
        If I have to do that somewhere, having something like an “index buffer” (just another indirection) is the way to go, as long as the data is stored in an optimal way for the system that touches it.

        Another question I have is, what is a mesh in your engine? Is a mesh a single part in a bigger group most people call a “model”? If no does this mean you don’t support “sub-parts” that have not been “combined” in Maya / Max which would mean all the subparts are collapsed into one single “mesh” in your file?

        A mesh can have several so-called triangle groups, and every such group can have a different material, among other things. Triangle groups are “sub-parts”.

        Thank you so much for your awesome blog, me and many others are learning a lot 🙂

        Thank you!
        I really appreciate that, coming from someone like you! I hope the blog is interesting and inspiring to others.

  10. Hi Stefan.

    Sorry to answer this post after so much time, but I just discovered your work.

    There’s a point I would like to clarify, concerning component ownership. You’re giving an example with the SkinnedMeshComponent, arguing that the SkinnedMeshSystem owns the SkinnedMeshComponent’s while manipulating them.

    Do you mean that a single system can manage a single type of components ? I know that this is a bit off-topic and not directly related to the handle system, but it could help me out understanding the relation/communication, if any, between components of different types.

    For instance, I was expecting that something called a RenderSystem should access a ModelComponent and a TransformComponent to know what to draw and where. But it does’nt seem to be very compliant with the cache-friendliness you’re trying to achieve.

    So how do you deal with that ?

    • Do you mean that a single system can manage a single type of components?

      Some systems also deal with several types of components, e.g. the LightingSystem knows about all kinds of light sources and their components like point lights, directional lights, etc. But in general, yes, a system stores one type of components.
      Regarding your other question about cache-friendly access to different components, see my reply to darksecond’s comment.

      • Yes and no.
        If you take the example with the matrix palettes, they are stored in a temporary contiguous block of memory each frame. They are stored nowhere else, so they are not persistent. In that case it’s not duplicate data, but rather data stored in temporary memory each frame. That temporary memory comes from a fast per-frame linear allocator, and is used by other systems as well for putting their per-frame data in.
        Other system interactions might give need to duplicate data, such as e.g. a physics system handing data over to the rendering system.

  11. Thanks for your quick answer.

    So to sume up and to ensure that I am well understanding , you’re never reasoning in terms of a system browsing several types of components ( except for components of the same “kind” but unrelated like the light components in your Lightsystem example ) to get the data. You’re working with a system managing and using a “main” component type, and getting all needed data through inputs of the update function, these data being the result of other systems updating their own components.

    Huh…Am I clear ( and am I right above all :p ) ?

    • Yes, that’s the gist of it :). There are a few more details to talk about like how components are stored and managed efficiently, and how entities refer to components living in distinct systems, but that has to wait until a proper blog post is up.

  12. How do you handle memory for systems and components? For example, when loading a new level, it might bring in a bunch of ‘renderables’. Does the system that handles those have a set maximum of said component, or does it allocate more memory at load time? (For example, it could keep an dynamically loaded array per ‘level’).

    • That depends.
      You can either configure the engine to allocate automatically based on how many resources are contained in a resource package, or set upper limits for different types of components. For dynamic allocations, you can still use a stack-based allocator where you can throw away everything after a certain level or package has been unloaded.

      • No, each array for a component type is stored once in the system taking care of that component.
        The systems use ID-arrays for component management so they can store the data in whatever order they need. Access from “outside” (higher-level engine parts) only happens using IDs.

  13. Pingback: Using runtime-compiled C++ code as a scripting language: under the hood | Molecular Musings

  14. Hi Stephan,

    Regarding your type Id generation you said in an early comment:
    TypeToId is the class template I use for that purpose, TypeToId::ID yields a unique integer for every type T.

    I’ve been looking for a compile time integer generator on the internet but I couldn’t find anything that satisfies your usage (a lot of them use a function that increments a counter). Could you give me some pointers on how to implement (or where to find) a system like yours? Now I’m really intrigued 🙂

    Thanks!
    Marc

    • Hi Marc,

      The TypeToId mechanism works like the following:


      // one counter for all different TypeToId templates.
      // it is crucial that the struct itself is not a template, so that all TypeToId instantiations share the same static counter.
      struct ComponentTypeRegistrar
      {
      static uint32_t counter;

      template <typename T>
      static uint32_t RegisterFunctions(void)
      {
      const uint32_t index = counter;
      ++counter;

      // ...omitted..., using index here

      return index;
      }
      };

      // each ID generates its own TypeToId template, but all template instantiations use the same global counter.
      // therefore, the TypeToId::ID is a monotonically increasing value which is unique for each T.
      // the values themselves don't matter, only that they are distinct for each type.
      template <typename T>
      struct TypeToId
      {
      static uint32_t ID;
      };

      template <typename T>
      uint32_t Entity::TypeToId<T>::ID = Entity::ComponentTypeRegistrar::RegisterFunctions<T>();

      Whenever you use TypeToId<T>::ID somewhere, the compiler must compile the definition above, calling ComponentTypeRegistrar::RegisterFunctions<T> in the process.
      So you get one unique integer per type, as described above.

      Hope that helps!

      • I saw a similar technique, without using the counter, that produces id’s of type size_t:

        template <typename T>
        static size_t registerType() {
        static T* temp;
        return (size_t)&temp;
        }

        template <typename T>
        struct TypeToID {
        static const size_t Value;
        };

        template <typename T>
        const size_t TypeToID<T>::Value = registerType<T>();

      • Hi Stefan,

        Thank you! It really helped! I couldn’t figure out how to use the counter and having it assigned to the static ID variable.

        Thanks Garnold! I saw this solution, the problem is that I wanted a uint16_t ID, that’s why I wanted to figure out Stefan’s solution.

  15. I’m curious how you deal with iterating over multiple component arrays simultaneously. For example, say you have a Position component and a Velocity component. Do you simply iterate over one of them in memory order and suck-up the cost of the (essentially random) iteration over the other? Or do you sort them first?

    • Hmmm, I don’t see why Position and Velocity components would be in a different order? If components can only exist together (like in this case, I guess), then both their arrays can stay in the same order, and this can even be enforced by the run-time.

      Can you maybe give a short example that shows were that might not be the case?

      • Sorry, I omitted that (for the sake of the example) Velocity may not be present on all components. That was the case I was curious about. In my system I just suck up the cache misses, because keeping the components in a stable order is more expensive.

      • In my case, all components of the same type are handled by the corresponding system that works on all of them in bulk. There are never any holes in the memory structure that stores the components’ data. In order to be able to do that, going from an external ID (visible to the user) to a pointer incurs an extra indirection, but only happens once per frame for each component you touch in user-code (e.g. scripts).

  16. hey,
    i found your page a few days ago and now i am trying to implement some of it.
    I have some difficulty to understand how and where you put the following functions:

    LookupComponent(id);
    GetComponentWorld()
    GetComponentSystem()

    Do you put them in the “EntityManager” class like this: https://gist.github.com/scorvi/864eef45533242a095a1

    and is this how you register a component “KinematicData” :
    template
    struct ComponentIdToSystem<ComponentTypeToId::ID> {
    typedef KinematicSystem Type;
    };
    template
    struct ComponentIdToWorld<ComponentTypeToId::ID> {
    typedef KinematicWorld Type;
    };

    template
    struct ComponentIdToComponent<ComponentTypeToId::ID> {
    typedef KinematicData Type;
    };

    template
    struct ComponentToComponentId {
    ????
    };

    … it is my first time using templates so i have some problems to use them correctly ..

    • Your code on github looks fine, where do you have troubles?
      It doesn’t matter so much where the LookupComponent(), GetComponentWorld() and GetComponentSystem() functions are implemented. In Molecule, a World is responsible for looking up components, so that’s where LookupComponent() is located. A World can be a loading screen, a level, a mini-game, etc. Something that has its own entities, render world, resources, and so on.

      Furthermore, a World owns an EntityWorld and a RenderWorld, so GetWorld() is also implemented in the World itself, and returns one of its members, just like in the pseudo code you linked to. The world that is returned then implements GetSystem(), and returns the appropriate system depending on the type/ID of the component. Lastly, this system is responsible for doing the actual lookup.

      The struct ComponentToComponentId simply has an inner typedef that defines the type of the ID belonging to a certain component. All ID types (e.g. MeshID, SkinnedMeshID, ParticleSystemID) are their own types/structs. There is no generic ID type, and hence ID types can be used for overload resolution in functions, as well as distinct template parameters.

      • Interesting but why you separate World from EntityWorld, entities just could be members of World (e.g. in ID-array)? Also why you separate WorldX from SystemX e.g. RenderWorld from RenderSystem? All components are members of RenderSystem anyways, so RenderWorld has no actual functionality? So in the end you could have a World which has an ID-array for entities and and ID-array for all Systems. With a component system registry you could then easily create needed systems for each World and spawn all needed entities based on your “level”/”scene” file.

      • It’s not as easy as that.

        First, all the functionality of the engine is separated into distinct libraries, where lower-level libraries are never allowed to know anything about higher level libraries. So all the rendering functionality (RenderWorld, all the systems, backend, etc.) is compiled into one library, all things belonging to the entity-component-system is compiled into a different library. Same for physics, core library code, input, etc.
        This is kind of a self-imposed constraint, which makes you think harder about (de-)coupling of systems/libraries, and keeps dependencies between engine components small. Of course, you don’t have to do it that way.

        Second, not all components are members of RenderSystem – there is no such thing as one single RenderSystem. There are several systems, each responsible for one thing: meshes, skinned meshes, animations, skinning, post-processing, lighting, and so on. All those systems are owned by the RenderWorld. The RenderWorld ties all the systems together, and is responsible for looking up components in the respective systems, based on a component’s type.

        For me as a developer, this makes it easier to work on one thing in an isolated way, without having to touch other systems, or other data structures. I can decide how data is stored in the systems, be it AoS, SoA, or a mix of both.
        As a user of the system, I want you to be able to access components in a straightforward way, without having to know that the MeshSystem stores MeshComponents, the AnimationSystem stores AnimationComponents, and so on. I want you to have one single point of contact for retrieving a component that is part of a certain library. For the user this means that whenever he wants to access e.g. a component that is part of the rendering library, he can ask the RenderWorld about it.

      • Hi Stefan,

        “A World can be a loading screen, a level, a mini-game, etc. Something that has its own entities, render world, resources, and so on.”

        May I ask how you handle multiple worlds? Does your engine have like an array of “World”-s in a higher-level class, which manages them somehow? Also, if each world has its own RenderWorld (and other worlds), how do they share a rendering context (for example an OpenGL context)? I thought initializing that would be the RenderWorld’s (the backend’s job) – although I’m really not sure which part of the graphics module should be responsible for that – but the “multiple Worlds” concept really makes me re-think how all this should work.

        I’ve learnt a lot from your blog, please keep writing these posts! 🙂

      • May I ask how you handle multiple worlds? Does your engine have like an array of “World”-s in a higher-level class, which manages them somehow?

        This is done by the user, either in the game’s C++ code or in scripts. It’s perfectly legitimate to only have one World for the whole game, loading/unloading resources into/from this World all the time.

        Personally, I prefer to have several Worlds, where the script code for a “loader” entity kicks off loading of other Worlds. As an example, you could have a Startup World that does nothing more than load the company logo, while kicking off loading for the main menu World. Once you’ve loaded a Game World, you could have a second World for the in-game menu. This World would be resident in memory, but only update/render itself when the game is paused and the menu is open. I like to treat Worlds like independent pieces of a game that can be edited, simulated, and rendered in isolation. Makes managing of resources and objects much easier.

        Tying all of this together is the user’s responsibility, though.

        Also, if each world has its own RenderWorld (and other worlds), how do they share a rendering context (for example an OpenGL context)? I thought initializing that would be the RenderWorld’s (the backend’s job) – although I’m really not sure which part of the graphics module should be responsible for that.

        No, a RenderWorld holds resources and components that are stored in the RenderWorld’s systems, like meshes, textures, etc. A RenderWorld does not own the render backend – that belongs to the graphics module.

        To give you a better idea of how this works, I can briefly explain how you use different parts of the Molecule Engine:
        Molecule is divided into several modules, which you need to start up and shut down independently. Maybe not every game needs every module, and maybe you want to handle certain aspects of the engine (e.g. physics) yourself. Therefore, starting up and shutting down the engine usually looks like the following:


        core::Startup(...);
        input::Startup(...);
        graphics::Startup(...);

        // the main loop
        ...

        graphics::Shutdown();
        input::Shutdown();
        core::Shutdown();

        The render backend is initialized in the call to graphics::Startup(), so every RenderWorld can access the same backend. The backend API is nothing more than a namespace with several dozen functions, also owning the D3D device/OGL context.

  17. Thank you for the answer, it makes sense now! In this case, I assume the number of maximum entities / components should be configured per-world, since we don’t want to waste the same amount of memory for a loading screen world (which needs just a few entities) than for the main game level.

  18. Hi Stefan, I am extremely interested in data oriented design and I want to apply it to my physics engine. I have some difficulty on applying it on a physics sytem. I have looked at Box2d implementation and Erin Catto use intrusive linked list to store RigidBodies. The reason why he use linked list over array is stated here http://www.box2d.org/forum/viewtopic.php?f=3&t=5389#p25251 .

    When physics engine do broadphase collision detection, It will return list of colliders pair that is most likely located randomly at memory. Finally as stated by Erin Catto, when islands are crated the set of rigid bodies is likely not to be contiguous on memory.

    Using ID means there is more cache misses (relative to pointer). Using handle means we cannot do the swap with the last trick. So how do data structure like AABB tree and islands reference the table? Or intrusive linked list is better in this case? I have asked similar question on here ( http://gamedev.stackexchange.com/questions/111971/data-oriented-design-in-physics-engine )

    • I would use an array of bodies (not body pointers!), and hand IDs to the user for referencing bodies. For each collision pair, you would get two IDs of the colliding bodies.

      In my opinion, there are the following best and worst cases:

      • You have about 1-5 collisions in a frame: Great, with that few collisions, there are going to be very few cache misses caused by the extra indirection of the ID. No big deal.
      • You have about 1000 collisions in a frame: With that many collisions, the performance impact of having to resolve 1000 collision pairs will greatly outweigh the impact of extra cache misses.

      Internally, all the physics engine code will access the rigid bodies in the array directly, giving you the best performance. With ID-based systems, it is always a tradeoff between how many internal (direct) and how many external (via ID) accesses you are going to have.

      Keep in mind that you need to measure the performance of actual code in order to be sure :).

  19. Hello! I am really enjoying reading through this series of posts (albeit I am a couple years too late), but I am a bit stumped at this article. The looping through components and O(1) insertion/deletion makes sense, but how would I go about having a system that works on multiple components, such as a physics system that acts on a velocity components and position components? Wouldn’t it involve a bunch of lookups by EntityID to find the two components that belong to one entity leading to O(N^2) time?

    Anyway, thanks a lot for the great blog post, and finding the time to read through my question.

    • External references are only used when components from the “outside” want to access something stored and owned by an engine system, such as the physics system. The systems themselves always iterate and work on all their components directly and hence don’t need to work with IDs in the case of updating components.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s