
ECS 

The design goals for our ECS were to have as much code as possible run in parallel and to do it completely lock-free. The structure we came up with alternates between two phases, Run and Sync. We made it so that a system can take in any combination of components, since we found the one-system-per-component structure restrictive. During Run all systems run in parallel. Since this means a component can be handled by multiple systems simultaneously, systems can only access components as const instances. Each system instead stores its output in its own data buffer.

 

Once all systems have run, the ECS moves on to Sync. At the start of Sync the outputs are collected from every system and sorted by type and target entity. The sorted outputs are sent to the target component's sync function. Each component type has its own user-defined sync function, so the outputs can be handled as deltas, absolutes or a combination of the two. The sync function also defines how the component behaves when it receives multiple outputs. When all the outputs have been processed, the systems have their data buffers reset and the ECS is ready for the next Run phase.
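The Run/Sync alternation described above can be sketched roughly like this. This is a minimal single-threaded sketch with made-up names (MiniECS, Output, runPhase, syncPhase), not the engine's actual code; the sync rule here simply sums deltas:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// One output record: which entity is targeted and what delta to apply.
struct Output { int entity; float delta; };

struct System {
    std::vector<Output> buffer;                      // per-system output buffer
    std::function<void(std::vector<Output>&)> run;   // reads components as const, writes outputs
};

struct MiniECS {
    std::vector<float> positions;   // one component type, indexed by entity ID
    std::vector<System> systems;

    void runPhase() {
        // In the real engine every system would run on its own thread here;
        // since each writes only to its own buffer, no locks are needed.
        for (auto& s : systems) s.run(s.buffer);
    }

    void syncPhase() {
        // Collect the outputs and apply them through the component's sync
        // rule (here: treat every output as a delta and sum them up).
        for (auto& s : systems) {
            for (const Output& o : s.buffer) positions[o.entity] += o.delta;
            s.buffer.clear();   // reset the buffer for the next Run phase
        }
    }
};
```

Because systems never mutate components during Run, two systems can safely target the same entity; their outputs are merged during Sync instead.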

Purpose – Why?

  • We determined that creating an ECS would let us build our games with a high degree of modularity.

  • ECS is becoming more common, and it shares similarities with the component-based structures that are industry standard. Since we had previously worked mostly with object-oriented programming, we felt it would be useful to gain experience working with it.

  • We wanted to build an engine that was heavily threaded so that we could take advantage of as much CPU power as possible. It would also make it easier for us to dedicate a thread to heavier systems such as AI and rendering. The ECS structure lent itself very well to this.

  • ECS is inherently very difficult to debug, which is something we realised and discussed before implementing it in our engine (Memory Leek). We did not come up with a way to mitigate this, but concluded that we would most likely learn to handle the problem along the way.

Implementation/Structure

  • At first Oscar Åkesson[www.oscarakesson.com/post-haste] and I spent some time planning the structure and building prototypes for the ECS. Due to sickness, I wrote the version that ended up in the engine; later we both went over that version. Further along the year we discussed structures and implementations as things came up, but Oscar focused on other tasks while I iterated on the ECS.

  • The structure we came up with splits run time into two phases, Run and Sync. During Run each system runs in parallel and can use an arbitrary collection of components. To take advantage of as much of the CPU as possible, we also ensured that it runs lock-free. Since this means a component can be handled by multiple systems at once, we made the components const during Run. Each system instead outputs the delta, or the absolute state depending on the component, for each component it wants to change. During Sync all the outputs are collected and given to the component through its sync function, where they are processed and applied to the component. Afterwards the ECS goes back to Run.

  • ECS
  • The main class in the ECS is the ECSManager, which is a singleton that functions as an interface to the rest of the ECS. It also functions as a system in the regard that it can take in an output of any component type and deliver it to the correct component during Sync.

  • The ECSManager also contains one instance each of the other classes that manage the ECS: EntityManager, ComponentManager, SystemManager and BitmaskManager.

  • The BitmaskManager stores a const type_info* for each component type in a vector. The index in the vector is the component type's ID. A bitmask of any component combination can be generated via a variadic template function. These bitmasks are used to identify component combinations across the ECS.

  • The EntityManager contains an array of entity entries. Each entry is a bool and a bitmask: the bool tells whether the entity is alive and the bitmask tells which components it has. The ID of an entity is its index in the array. The EntityManager can take in a bitmask and return a vector of the entities to which it applies. This is used in the systems, but can also be called from outside the ECS to get a list of entities with a certain set of components.

 

  • The ComponentManager holds a C-style array sized for the max entity count for each component type. The reason it holds an array of that size is so that an entity only needs to hold a bit for each component instead of a pointer: we wanted to keep the entity small and avoid the cache miss that a pointer indirection would cause. This however caused a problem when we wanted to add bigger components. When we added our particle emitters to the ECS, they took multiple gigabytes of RAM. Our quick solution was to lower the entity count and the particle count per emitter. If I had had time, I would have made it so that when registering a component, the user also declares how many instances the ECS should contain. This would mean that an entity would have to contain pointers and would add extra cache misses, but I believe the decrease in RAM usage would be worth it. It would also mean that the entity array becomes too big to be stored in cache, but we don't know if it currently fits, so that might not matter.

  • The component arrays are of ComponentBase*. ComponentBase is in turn inherited by the templated struct Component.

 

The reason we did it this way, instead of giving the components a polymorphic base, was to avoid vtables; an array of Component<X> can now be read as an array of X. This means that if someone wants to access multiple components of one type, as in a system, they don't need to make a function call for every component.
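A minimal sketch of this layout trick, with hypothetical names (the engine's actual declarations may differ). It relies on the empty-base optimization: because ComponentBase is empty and non-virtual, Component<T> is the same size as T, so a contiguous array of Component<T> can be reinterpreted as an array of T:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct ComponentBase {};   // no virtual functions, so no vtable pointer

template <typename T>
struct Component : ComponentBase {
    T data;                // the actual component payload
};

// Example payload type, purely for illustration.
struct Transform { float x, y, z; };

// With the empty-base optimization (universal on mainstream compilers,
// though not strictly guaranteed by the standard), the wrapper adds no size
// and no vtable.
static_assert(sizeof(Component<Transform>) == sizeof(Transform),
              "empty base must not add size");
static_assert(!std::is_polymorphic<Component<Transform>>::value,
              "no vtable expected");
```

A system iterating a component array can then walk a plain `Transform*` with no per-element virtual call, which is the point the paragraph above makes.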

Working on the array directly like this would however not be possible if the component arrays were of different sizes, since the entity ID would no longer be its index in the array. This is why the current structure has fewer cache misses than if the entities contained pointers. Not using inheritance also means that any class/struct can be registered as a component, since the sync function is declared outside the class/struct. This is however not something we took advantage of. If I implemented this today, I would use inheritance, mainly because it is easier to read and the user wouldn't have to add the sync function separately, but also because inheritance allows for more, like self-registering components.

  • Alongside the array, the ComponentManager also stores the sync function, a type_info* and the destructor for the component. The destructor has to be stored alongside the array because the components do not use inheritance. We encountered this problem when we wanted to add a component that contained more than trivial data, namely the ScriptComponent, which contains a std::vector. It is however not possible to store a function pointer to a destructor directly, so the pointer instead targets a function in a templated struct that calls the destructor.

It is called from within the ComponentManager, via the DestructComponent function, after Sync on components that have been queued for destruction and on all outputs.
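The stored-destructor trick can be sketched like this. The names (DestructorCaller, ComponentRecord, MakeRecord) are illustrative stand-ins, not the engine's actual declarations; the key idea is that a destructor cannot be taken by function pointer, but an ordinary static function that performs an explicit destructor call can:

```cpp
#include <cassert>
#include <new>

// A templated struct whose static function wraps the destructor call in
// something an ordinary function pointer can target.
template <typename T>
struct DestructorCaller {
    static void Destruct(void* ptr) {
        static_cast<T*>(ptr)->~T();   // explicit destructor call
    }
};

using DestructFn = void (*)(void*);

// What a ComponentManager-style class could keep next to each component
// array: a type-erased destructor for that component type.
struct ComponentRecord {
    DestructFn destruct;
};

template <typename T>
ComponentRecord MakeRecord() {
    return { &DestructorCaller<T>::Destruct };
}
```

With this in place, non-trivial components such as one holding a std::vector can be destroyed correctly even though the manager only sees them through a type-erased pointer.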

  • The SystemManager holds an entry for each registered system and a few simple workers that function as a thread pool. The SystemManager's purpose is to start every system during Run and give it as many threads as it asks for. It is also built in such a way that, by defining a macro, it can switch between running threaded and unthreaded. This is to help with debugging.

A SystemEntry contains an instance of the system to run, the bitmask of components the system requires to act upon an entity, a bool for disabling the system when it isn't needed, and a type_info* so the SystemManager can know upon registration whether it already has that system. Systems are added to the ECS with a single function call.

Upon the first call to Run, the SystemManager creates a number of Worker instances equal to the hardware concurrency. Each Worker starts a thread running its Work function. The Work function continually checks whether it has a system to run or whether it should shut down. If neither is the case, it yields.
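The check-job-or-yield loop can be sketched with atomics instead of locks, in keeping with the lock-free goal stated earlier. This is a simplified stand-in with assumed names (Worker, TryAssign, Work), not the engine's actual worker:

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

class Worker {
public:
    Worker() : myThread(&Worker::Work, this) {}
    ~Worker() {
        myShutdown = true;   // ask the loop to exit, then wait for it
        myThread.join();
    }

    // Hands the worker a job to run; returns false if it is still busy.
    bool TryAssign(std::function<void()> job) {
        if (myHasJob.load()) return false;
        myJob = std::move(job);
        myHasJob = true;     // publish the job to the worker thread
        return true;
    }

private:
    void Work() {
        while (!myShutdown) {
            if (myHasJob) {
                myJob();
                myHasJob = false;          // signals completion
            } else {
                std::this_thread::yield(); // nothing to do, give up the slice
            }
        }
    }

    std::function<void()> myJob;
    std::atomic<bool> myHasJob{false};
    std::atomic<bool> myShutdown{false};
    std::thread myThread;   // declared last so the atomics exist before it starts
};
```

Yielding instead of blocking on a condition variable keeps the loop lock-free, at the cost of some busy-waiting when no system is queued.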

Systems inherit from SystemBase, which in turn inherits from VirtualSystemBase. Since the systems run in parallel, each system's outputs must be stored somewhere while waiting to be delivered in Sync. To avoid calling new at runtime, each system allocates a data buffer upon start-up of the game for the component types it outputs. I did this by making SystemBase a variadic template, which is why VirtualSystemBase and SystemBase are two different classes.
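The split between the two base classes can be sketched as follows: the variadic template pack lists the output types, one pre-reserved buffer per type lives in a tuple, and the non-templated VirtualSystemBase gives the SystemManager a common handle. Names such as AddOutput and GetOutputs are assumptions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Non-templated base so the SystemManager can store any system uniformly.
struct VirtualSystemBase {
    virtual ~VirtualSystemBase() = default;
    virtual void Run() = 0;
};

// Variadic base: one pre-allocated output buffer per listed component type.
template <typename... OutputTypes>
class SystemBase : public VirtualSystemBase {
public:
    explicit SystemBase(size_t reserveCount) {
        // Reserve every buffer at start-up to avoid new during Run.
        (std::get<std::vector<OutputTypes>>(myBuffers).reserve(reserveCount), ...);
    }

    template <typename T>
    std::vector<T>& GetOutputs() {
        return std::get<std::vector<T>>(myBuffers);
    }

protected:
    template <typename T>
    void AddOutput(const T& out) {
        std::get<std::vector<T>>(myBuffers).push_back(out);
    }

private:
    std::tuple<std::vector<OutputTypes>...> myBuffers;
};
```

A concrete system then derives as e.g. `class MoveSystem : public SystemBase<PosDelta, HpDelta>` and calls AddOutput during Run; during Sync the manager drains each buffer through VirtualSystemBase.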

 
