Learning Svelto.ECS by Example: The Unity Survival Example


Lately I have been discussing Svelto.ECS extensively with several, more or less experienced, programmers. I gathered a lot of feedback and took a lot of notes that I will be using as starting point for my next articles where I will talk more about the theory and good practices. Just to give a little spoiler, I realized that the biggest obstacle that new coders face when starting using Svelto.ECS is the shift of programming paradigm. It’s astonishing how much I have to write to explain the novel concepts introduced by Svelto.ECS compared to the small amount of code written to develop the framework. In fact, while the framework itself is very simple (and lightweight), learning how to move from the class inheritance heavy object oriented design or even the naive Unity components based design, to the “new” modular and decoupled design that Svelto.ECS forces to use, is what usually discourages people from adopting the framework.

Being the framework extensively used at Freejam, I also noticed that it’s thanks to my continuous availability to explain the fundamental concepts that my colleagues have less of a hard time to get in to the flow. Although Svelto.ECS tries to be as rigid as possible, bad habits are hard to die, so users tend to abuse the little flexibility left to adapt the framework to the “old” paradigms they are comfortable with. This can result in a catastrophe due to misunderstandings or reinterpretation of the concepts that are behind the logic of the framework. That’s why I am committed to write as many articles as possible, especially because I know that the ECS paradigm is the best solution I found so far to write efficient and maintainable code for large projects that are refactored and reshaped multiple times over the span of years and Robocraft as well as Cardlife are the existing proof of what I try to demonstrate.

I am not going to talk much about the theories behind the framework in this article, but I want to remind what took me to the path of ditching the use of an IoC Container and starting using exclusively the ECS framework: an IoC container is a very dangerous tool if used without Inversion of Control in mind. As you should have read from my previous articles, I differentiate between Inversion of Creation Control and Inversion of Flow Control. Inversion of Flow Control is basically the Hollywood principle “Don’t call us, we will call you”. This means that dependencies injected should never been used directly through public methods, in doing so you would just use an IoC container as a substitute of any other form of global injection like singletons. However, once an IoC container is used following the IoC principle, it mainly ends up in using repeatedly the Template Pattern to inject managers used only to register the entities to manage. In a real Inversion of Flow Control context, managers are always in charge of handle entities. Does it sounds like what the ECS pattern is about? It indeed does. From this reasoning I took the ECS pattern and evolved it into a rigid framework to the point it can be considered using it like adopting a new coding paradigm.


The Survival Example

let’s start downloading the project from https://github.com/sebas77/Svelto.ECS.Examples.Survival

Remember this? I rewrote it for you!

open the scene Level01 and open the project in your IDE. Everything starts from maincontext.cs file.

The Composition Root and the EnginesRoot

The class Main is the Application Composition Root. A Composition Root is where dependencies are created and injected (I talk a lot about this in my articles). A composition root belongs to the context, but a context can have more than one composition root. For example a factory is a composition root. Furthermore an application can have more than one context but this is an advanced scenario and not part of this example.

Before to start digging in the code, let’s introduce the first terms of the Svelto.ECS domain language. ECS is an acronym for Entity Component System. The ECS infrastructure has been analyzed abundantly with several articles written by many authors, but while the basic concepts are in common, the implementations differ a lot. Above all, there isn’t a standard way to solve the few problems rising from using ECS oriented code. That’s where I put most of my effort on, but this is something I will talk about later or in the next articles. At the heart of the theory there are the concepts of Entity, Components (of the entities) and Systems. While I understand why the word System has been used historically, I initially found it not intuitive for the purpose, so Engine is synonym of System and you may use it interchangeably according your preferences.

The EnginesRoot class is the core of Svelto.ECS. With it is possible to register the engines and build all the entities of the game. It doesn’t make much sense to create engines dynamically, so all the engines should be added in the EnginesRoot instance from the same composition root where it has been created. For similar reasons, an EnginesRoot instance must never been injected and engines can’t be removed once added.

We need at least one composition root to be able to create and inject dependencies wherever needed. Yes, it’s possible to have even more than one EnginesRoot per application, but this is too not part of this article, which I will try to keep as simple as possible. This is how a composition root with engines creation and dependencies injection looks like:

This code is part of the survival example which is now well commented and follows almost all the good practice rules that I suggest to use, including keeping the engines logic platform independent and testable. The comments will help you to understand most of it, but a project of this size may be already too much to swallow if you are new to Svelto. For this reason, let’s proceed as we would have done if started from scratch:


the first step after creating the empty Composition Root and an instance of the EnginesRoot class, would be to identify the entities you want to work with first. Let’s logically start from the Player Entity. The Svelto.ECS entity must not be confused with the Unity GameObject. If you had the chance to read other ECS related articles, you will see that in many of those, entities are often described as indices. This is probably the worst way possible to introduce the concept. While this is true for Svelto.ECS too, it’s well hidden. As matter of fact I want the Svelto.ECS user to visualize, describe and identify every single entity in terms of Game Design Domain language. An entity in code must be an entity described in the game design document. Any other form of entity definition will result in a contrived way to adapt your old paradigms to the Svelto.ECS needs. Follow this fundamental rule and you won’t be wrong in most of the cases. An entity class doesn’t exist per se in code, but you still must define it in a not abstract way.


Next step is to think about what behaviours to give to this Entity. Every behaviour is always modeled inside an Engine, there is no way to add logic in any other classes inside a Svelto.ECS application. For this purpose we can start from the player character movement and define the PlayerMovementEngine class. The name of the engine must be very specific, as the more specific it is, the higher is the chance the Engine will follow the Single Responsibility Rule. Naming classes properly in Svelto.ECS is of fundamental importance. It’s not just to comunicate clearly your intentions, but it’s actually more about letting you think about your intentions.

For the same reason is as important to put your engine inside a very specific namespace. If you use namespaces according the folder structure, adapt to the Svelto.ECS convention. Using very specialized namespaces helps a lot to identify code design errors when entities are used inside not compatible namespaces. For example, you wouldn’t expect any enemy entity to be used inside a player namespace, unless you want to break the good rules related to modularity and decoupling of objects. The idea is that objects of a specific namespace can be used only inside that namespace or a parent namespace. While with Svelto.ECS is much harder to turn your code in to a fully fledged spaghetti bowl, where dependencies are injected everywhere and randomly, this rule will help you to take your code to an even better level where dependencies are correctly abstracted between classes.

In Svelto.ECS abstraction is pushed on several fronts, but ECS intrinsically promote abstraction of the data from the logic that must handle the data. Entities are defined by their data, not their behaviours. Engines instead are the place where to put the shared behaviours of the same entities, so that engines can always operate on a set of entities.

Svelto.ECS and the ECS paradigm in general allow the coders to achieve one of the holy grails of clean programming, that is the perfect encapsulation of logic. Engines must not have public functions. The only public functions that need to exist are the ones needed to implement framework interfaces. This naturally leads to forget about dependency injection and avoid the awkward code deriving by the use of dependency injection without inversion of control. Engines must NEVER be injected in any other engine or whatever other type of class. If you think to inject an engine, you would just make a fundamental code design mistake.

Compared to Unity monobehaviours, engines already show the first huge benefit, which is the possibility to access to all the entity states of a given type from the same code scope. This means that the code can easily use the state of all the entities directly from the same place where the shared entity logic is going to run. Furthermore separate engines can handle the same entities so that an engine can change an entity state while another engine can read it, effectively putting the two engines in communication through the same entity data. An example of this can be seen with the engines PlayerGunShootingEngine and PlayerGunShootingFxsEngine. In this case the two engines are in the same namespace, so they can share the same entity data. PlayerGunShootingEngine determines if a player target (an enemy) has been damaged and writes the lastTargetPosition of the IGunAttributesComponent (which is a component of the PlayerGunEntity). the PlayerGunShootFxsEngine handles the graphic effects of the gun and reads the position of the currently targeted player target. This is an example of communication between engines through data polling. Later in this article I will show how to let engine communicate between them through data pushing (or data binding). It’s quite logical that engines should (and must) never hold states.

Engines are not supposed to know how to interact with other engines. External communication happens through abstraction and Svelto.ECS solves communication between engines in three different official ways, but I will talk about this later. The best engines are the ones that do not even need to trigger any form of external communication. These engines reflect a well encapsulated behaviour and usually work through a logic loop. Loops are always modeled with a Svelto.Task task inside Svelto.ECS applications. Since the player movement must be updated every physic tick, it would be natural to create a task that is executed every physic update. Svelto.Tasks allows to run every kind of IEnumerator on several types of schedulers. In this case we decide to create a task on the PhysicScheduler that allows to update the player position:

Svelto.Tasks tasks can be executed directly or through ITaskRoutine objects. I won’t talk much about Svelto.Tasks here as I wrote other articles for it. The reason I decided to use a task routine instead to run the IEnumerator implementation directly is quite discretional. I wanted to show that is possible to start the loop when the player entity is added in the engine and stop it when it is removed. However in order to do so, I need to know when the entity is added and removed.

Svelto.ECS introduces the Add and Remove callbacks to know when specific entities are added or removed. This is something unique in Svelto.ECS, but it should be used wisely. I have often seen these callbacks being abused, as in many cases is just enough to query entities. Even holding an entity reference as engine field must be seen as an exception more than a rule.

Only when these callbacks need to be exploited the engine must inherit either from SingleEntityViewEngine<EntityView> or a MultiEntitiesViewEngine<EntityView1,…,EntityViewN>. Again the use of those should be rare and by no means they intend to communicate what entities the engine is going to handle.

Engines more commonly implement the IQueryingEntityViewEngine interface instead. This allows to access the entity database and retrieve data from it. Remember you can always query any entity from inside an engine, but in the moment you are querying an entity that is not compatible with the namespace where the engine lies, then you know you are already doing something wrong. Engines should never assume that the entities are available and they must work on a set of entities. The fact that the game will always have only one player shouldn’t be assumed like I am doing in the code example. A very common approach to how to query entities is found in the EnemyMovementEngine:

In this case the engine main loop is running directly immediately on the predefined scheduler. Tick().Run() shows the shortest way to run an IEnumerator with Svelto.Tasks. The IEnumerator will keep on yielding to the next frame until at least one Enemy Target is found. Since we know that there will be always just one target (another not nice assumption), I pickup the first available. While the Enemy Target can be just one (although could have been more!), the enemies are many and the engine takes care of the movement logic for all of them. In this case I am cheating, as I am actually using the Unity Nav Mesh System, so all I have to do is just to set the NavMesh destination. To be honest with you, I never used the Unity NavMesh code, so I am not even sure of how it works exactly, this code has been just inherited from the original Survival demo.

Note that the component never exposes the Unity navmesh dependency directly. Entity Component, as I will say later, should always expose value types. In this case this rule also allows to keep the code testable, as the field value type navMeshDestination could be later implemented without using a Unity Nav Mesh stub.

To conclude the paragraph related to engines, note that there isn’t such a thing as a too small engine. Hence, don’t be afraid to write an engine even for just few lines of code, after all you can’t put logic anywhere else and you want your engines to follow the Single Responsibility Rule.


So far we introduced the concept of Engine and an abstracted definition of Entity, let’s now define what an EntityView is. I have to admit, of the 5 concepts of which Svelto.ECS is built upon, the EntityViews is probably the most confusing. Previously called Node, name taken from the Ash ECS framework, I realized that node meant nothing. EntityView may be confusing as well, since programmers usually associate views with the concept coming from Model View Controller pattern, however in Svelto.ECS is called View because an EntityView is how the Engine views an Entity. I like to describe it in this way, as it seems the most natural, but I could have also called it EntityMap, as the EntityView maps the entity components that the engine must access to. This scheme of the Svelto.ECS concepts should help a bit:

I suggested to start working on the Engine first, thus we are on the right side of this scheme. Every Engine comes with its own set of EntityViews. An Engine can reuse namespace compatible EntityViews, but it’s more common for an Engine to define its entity views. The Engine doesn’t care if a Player Entity definition actually exists, it dictates the fact that it needs a PlayerEntityView to work. The writing of the code is driven by the Engine needs, you shouldn’t create the entity and its field before to know how to use those fields. In a more complex scenario, the name of the EntityView could have been even more specific. For example, if we had to write complex engines to handle the logic of the player and the rendering of the graphic of the player (or the animation and so on) we could have had a PlayerPhysicEngine with a PlayerPhysicEntityView, then a PlayerGraphicEngine with a PlayerGraphicEntityView or even a PlayerAnimationEngine with a PlayerAnimationEntityView. Even more specific names could be used, like PlayerPhysicMovementEngine or PlayerPhysicJumpEngine (and so on).

EntityViews are classes that hold only Entity Components. Entity Components in Svelto.ECS are always interfaces that must be implemented, but the Engine and EntityView doesn’t need to know the implementation. For this reason, without even implementing the component yet, I can just start to type the logic in the engine like if it was in place:

Even if PlayerEntityView is still an empty class, I start to use the fields I need. Since an EntityView can hold only components, the field must be a component interface.

Hoping you use an IDE that supports refactoring, the IDE will immediately warn you that the field you are trying to access actually doesn’t exist. This is where the refactoring tools can help you speeding up the writing of the code. For example, using Jetbrains rider (but it’s the same with Visual Studio) you can create the field automatically like this:

this would add the component field in the EntityView like:

since the IPlayerInputComponent interface doesn’t exist yet, I name it and use on the spot. Then I use again the refactoring tool:

this will create an empty interface, so that now the code would look like:

the inputComponent field now exists, but it’s empty so the input field is not defined yet, but I know I need it.

Yes that’s right, I would use again the refactoring tool:

so that the IPlayerInputComponent interface will be filled with the right properties. As long as I don’t run the code, I can build it without needing to implement the Entity Component IPlayerInputComponent interface yet. Honestly, once you get in this flow, you will notice how fast can be coding with Svelto.ECS using the IDE refactoring tools.

EntityViews also help to keep the code modular. If engines would be able to access to all the entity components, you wouldn’t be pushed to think what the engine actually really needs to do. You could fall into the trap to add several responsibilities inside the same engine as you can easily access all the properties, even if they are not closely related to that engine behaviour.


We understood that engines model behaviours for a set of entities data and we understood that engines do not use entities directly, but use the entity components through entity views. We understood that an EntityView is a class that can hold ONLY public entity components. I also hinted that entity components are always interfaces, so let’s give a better definition now:

Entities are a set of data and entity components are the way to access that data. In case you haven’t notice it yet, defining entity components as interfaces is another pretty unique feature of Svelto.ECS. Commonly components in other frameworks are objects. Using interfaces instead helps dramatically to keep the code abstracted. If you follow the Interface Segregation Principle writing small component interfaces, even with just one property each, you will notice that a given point you will start to reuse component interfaces inside different entities. In our example ITransformComponent is reused inside many entity views. Using components as interfaces also allow them to be implemented with the same objects, which in many case allows to simplify the communication between entities that see the same entity through different entity views (or the same one if possible).

Therefore, in Svelto.ECS, an entity component is always an interface and this interface is only used through the field of an EntityView inside an Engine. The Entity Component interface is then implemented with a so called Implementor. We are now starting to define the Entity itself and we are on the left side of the scheme above.

Components should always hold value types and the fields are always getter and setter properties. Exceptions may be made only to write setters ad getters as methods to exploit the ref keyword when this optimization is needed. This doesn’t really mean that the code is data oriented, but it would allow to create testable code as the Engine Logic shouldn’t deal with references to external dependancies. Moreover it prevents people from cheating and use public functions (which would include logic!) of random objects. The only reason one could feel the need to use references inside Entity Components interfaces is to deal with third party dependencies, like unity objects. However, the Survival example, shows how to deal with this case too, leaving the engines code testable without need to be concerned with Unity dependencies.


This is where the Entity Descriptors actually come to help to put everything together and hopefully let everything click in place. We know that Engines can access to Entity data through the Entity Components held by the Entity Views. We know that Engines are classes, EntityViews are classes that hold only Entity Components and that Entity Components are interfaces. While I have given an abstracted definition of entity, we haven’t seen any class that actually represent an entity. This is in line with the concept of entities being just IDs inside a modern ECS framework. However without a proper definition of Entity, this would lead coders to identify Entities with EntityViews, which would be catastrophically wrong. EntityViews is the way several Engines can view the same Entity but they are not the entities. The Entity itself should be always be seen as a set of data defined through the entity components, but even this representation is weak. An EntityDescriptor instance gives the chance to the coder to name properly their entities independently by the engines that are going to handle them. Therefore in the case of the Player Entity, we would need an PlayerEntityDescriptor. This class will then be using to build the entity, and while what it really does is something totally different, the fact that the user is able to write BuildEntity<PlayerEntityDescriptor>() helps immensely to visualize the entities to build and to communicate the intentions to other coders.

However what an EntityDescriptor really does is to build a list of EntityViews!!! In the very early stages of the framework development I was letting the coders building this list of EntityViews manually, which was leading to very ugly code as the couldn’t visualize anymore what an entity actually was.

This is how the PlayerEntityDescriptor looks like:

the EntityDescriptors (and the Implementors) are the only classes that can use identifiers from multiple namespaces. In this case the PlayerEntityDescriptor defines the list of EntityViews to instantiate and inject in the engine when the PlayerEntity is built.


The EntityDescriptorHolder is an extension for Unity and should be used only in specific cases. The most common one is to create a sort of polymorphism storing the information of the entity to build on the Unity GameObject. In this way the same code can be used to build multiple type of entities. For example, in Robocraft, we use a single cube factory that builds all the cubes of the machine. The kind of cube to build is stored in the prefab of the cube itself. However this is fine as long as implementors are the same between the cubes or found on the gameobject as monobehaviours. Building entities explicitly is always preferred, so use EntityDescriptorHolders only when you have understood Svelto.ECS properly otherwise there is the risk to abuse it. This function from the example shows how to use the class:

Note that with this example I am already using the less preferred, not generic, function BuildEntity. I will talk about it in a bit. The Implementors in this case are always monobehaviours in the gameobject. Also this is not a good practice. I actually should remove this code from the example, but left to show you this other case. Implementors, as we will see next, should be Monobehaviours only when strictly needed!


Before to build our entity, let’s define the last concept in Svelto.ECS that is the Implementor. As we know now, Entity Components are always interfaces and in c# interfaces must be implemented. The object that implements those interfaces are called Implementors. Implementors have several important characteristics:

  • Allow to uncouple the number of objects to build from the number of entity components needed to define the entity data.
  • Allow to share data between different components, as components expose data through properties, different component properties could return the same implementor field.
  • Allow to create stub of the entity component interface very easily. This is crucial for leaving the engine code testable.
  • Act as bridge between Svelto.ECS Engines and third party platforms. This characteristic is of fundamental importance. If you need unity to communicate with the engines you don’t need to use awkward workarounds, simply create an implementor as Monobehaviour. In this way you could use, inside the implementor, Unity callbacks, like OnTriggerEnter/OnTriggerExit and change data according the Unity callback. Logic should not be used inside these callback, except setting entity components data. Here an example:

Remember the granularity of your EntityViews, entity components and implementors is completely discretional and up to you. More granular they are, more the chance are to be reusable.

Build Entities

Let’s say we have created our Engines, added them in the EnginesRoot, created its EntityViews that use Entity Components as interfaces to be implemented by one or more Implementors. It is now time to build our first entity. An Entity is always built through the Entity Factory instance generated by the EnginesRoot through the function GenerateEntityFactory. Differently than the EnginesRoot instance, an IEntityFactory instance can be injected and passed around. Entities can be built inside the Composition Root or dynamically inside game factories, so for the latter case passing the IEntityFactory by parameter is necessary.

The IEntityFactory comes with several look alike functions. For the purposes of this article I will skip explaining the PreallocateEntitySlots<T> and the BuildMetaEntity<T> functions to focus on the most commonly used BuildEntity<T> and BuildEntityInGroup<T>.

BuildEntityInGroup<T> should actually always preferred, but I didn’t need it for the Survival example, so let’s see how the normal BuildEntity<T> is used inside the:

EnemySpawner Code

Don’t forget to read all the comments in the example, they help to clarify even more the Svelto.ECS concepts. Due to the simplicity of the example, I am actually not using the BuildEntityInGroup<T> which is instead commonly used in more sophisticated products. In Robocraft every engine that handles the logic of the functional cubes handles the logic of ALL the functional cubes of that specific type in game. However often is needed to know to which vehicle the cubes belong to, so using a group per machine would help to split the cubes of the same type per machine, where the machine ID is the group ID. This allows us to implement fancy things like running one Svelto.Tasks task per machine inside the same engine, which could even run in parallel using multi-threading.

This piece of code highlight one crucial issue, which I may talk more about in the next articles…from the comment (in case you haven’t read it):

Never create implementors as Monobehaviour just to hold data. Data should always been retrieved through a service layer regardless the data source. The benefit are numerous, including the fact that changing data source would require only changing the service code. In this simple example I am not using a Service Layer but you can see the point. Also note that I am loading the data only once per application run, outside the main loop. You can always exploit this trick when you now that the data you need to use will never change.

Initially I was reading the data directly from the monobehaviour like a good lazy coder would have done. This forced me to create an implementor as monobehaviour just to read serialized data. It could be considered OK as long as we don’t want to abstract the data source, however serializing the information into a json file and reading it from a service request is much better than reading this kind of data from an entity component.

Every entity needs an unique ID. This unique ID must be unique regardless the descriptor type and the group it belongs to. I took this decision recently, so if I say otherwise in other articles, please let me know I will fix it.

Communication in Svelto.ECS

One problem which solution has never been standardized by any ECS implementation is the communication between systems. This is another place where I put a lot of thought on and Svelto.ECS solves it in two novel ways. The third way is the use of the standard observer/observable pattern which is acceptable in very specific and uncommon cases.


We previously saw how to let engines communicate through entity components data polling. DispatchOnSet<T> and DispatchOnChange<T> are the only references (not value type data) that can be returned by entity components properties, but the type of the generic parameter T must be value type. The function names sound like event dispatcher, but instead they must be seen as data pushing methods, as opposite of data polling, a bit like data binding works. That’s it, some times data polling is awkward, we don’t want to poll a variable every frame when we know that the data changes rarely. DispatchOnSet<T> and DispatchOnChange<T> cannot be triggered without a data change, this will force to see them as a data binding mechanism instead of a simple event. Also there is no trigger function to call, instead the value of the data hold by these classes must be set or changed. There aren’t great examples in the Survival code, but you can see how the targetHit boolean of the IGunHitTargetComponent works. The difference between DispatchOnSet<T>and DispatchOnChange<T>is that the latter triggers the event only when the data actually changes, while the former always.

The Sequencer

Perfect engines are totally encapsulated and you can write the logic of that engine as a sequence of instructions using Svelto tasks and IEnumerators. However this is not always possible, as some times engines need to signal events to other engines. This is usually performed through entity data, especially using DispatchOnSet<T>and DispatchOnChange<T>, however like in the case of entities being damaged in the example, a series of independent and uncoupled engines are acting on it. Other times you want the sequence to be authoritative on the order of engines to call, like in the example where I want the death always happening for last. Not only a Sequence is very easy to use in this case, but it’s also very handy! Refactoring the sequence is super simple. So use IEnumerator Svelto Tasks for “vertical” engines and sequence for “horizontal” logic between engines.


I left the option to use this pattern especially for cases when legacy or not Svelto.ECS code must communicate with Svelto.ECS engines. For all the other cases, it must be used with extreme caution, as there is a high chance to abuse the pattern since it looks familiar to the coder new to Svelto.ECS and Sequencers are usually a better choice.

Svelto.ECS and Unity

Svelto.ECS (as well as Svelto.Tasks) is designed to be platform agnostic. However I mainly use it with unity, so Unity extensions are provided. The EntityDescriptorHolder is an example. Using implementors as Monobehaviour let to exploit most of the Unity callbacks, but on other platforms the reasoning could be very similar. All that said, Svelto.ECS gives you the chance to abstract from Unity and you should use Unity classes as less as possible, especially Monobehaviours. It’s also important to keep in mind that the creation of GameObject(s) is uncoupled from the creation of Entities, the only thing they have in common is the fact that gameobject monobehaviours can be implementors. You can see from the example that enemy gameobjects and their ECS entities are built indipendently.

Logic inside utility classes

You can create static utility classes to share code, that is not a problem as long as the static classes do not hold any state (they must be just a set of static functions)

closing notes EntityStruct:

I left the EntityStruct concept out of this article. As you may know, Unity is pushing their ECS framework as a mere optimization tool. As I am trying to explain in these articles, this is a mistake, but not only because the ECS paradigm is actually a great way to write maintainable code, but also because in real life, cases where the extreme optimizations resulting from writing cache friendly code is useful are very limited. However EntityStruct are cool and useful for multithreading code as well, but I will need to write better examples to show their power.


If you this article is not enough, you can also read the old ones: 

Svelto ECS is now production ready

if you are new to the ECS design and you wonder why it could be useful, you should read my previous articles:


That’s all folks! FEEDBACK ME!

Learning Svelto Tasks by example: The million points on multiple CPU cores example [now with Unity Jobs System comparison]

[11/03/2018] : added Unity Jobs System version and updated timings. Please check at the end of the article.

With my previous article on Svelto.Tasks and multi-threaded cache friendly code, I failed to show visually the power of Svelto.Tasks because I didn’t know how to upload a huge amount of data to the GPU without stalling the main thread. Honestly, I stopped being a graphic programmer right before DX 11 was introduced, so my knowledge of modern pipelines is limited. I also thought that to find a good solution I would have needed to write some kind of low level workaround to overcome the Unity API limitations, but with my great surprise, actually Unity has already got what I needed. It was just necessary some investigation and help to find out :).

With my great astonishment, using the compute buffers API is possible to upload every frame a huge amount of data without affecting the CPU too much. I still have to understand how this works and which DX 11 function ComputeBuffer.SetData maps onto, so if you know, please leave a comment, as I need to understand it, although it is not my priority for this current demo. As matter of fact, it was enough for me to know that uploading the vertices transformed on the CPU would not affect the final performance in a significative way.

I immediately threw some lines of code to show off how simple and efficient is to work with multi-threads with Svelto.Tasks and after combining it with the new IL2CPP feature of Unity 2018, I achieved the incredible number of 1 million particles transformed on the CPU at >30fps!! If someone would have told me that this was possible, to be honest, I would have had my doubts. However let me clear, these kind of demo are pretty lame because it doesn’t make much sense to do these kind of operations on the CPU. This is exactly what the GPUs are for. It’s more a show off than something practical, but the library, obviously, can be used for better cases.

I was also lucky enough to find a good and, most of all, simple demo on github, called MillionPoints, which does what I needed, that is transforming 1 Million points on the GPU using Compute Shaders and the Graphics.DrawMeshInstancedIndirect function. All I had to do is to convert the simple compute shaders code in pure c# and make it run on the CPU.

While I still don’t consider my self a multi-threaded code expert, because one never stops to learn and I didn’t have the chance to use multi-threading in sophisticated algorithms yet, I may dare to say that I start to have a good understanding of all the problems involved and consequentially I designed Svelto.Tasks to be very simple to use in a multi-thread environment, exactly like the new unity c# job system does. Since in Freejam we hire a lot of junior programmers, I had the chance to see first hand all the problems that could arise to give advanced tools to inexpert hands. That’s why I had to design something very straightforward to use and most of the time, worries free. This is what I hope to have achieved with Svelto.Tasks, improving it constantly over the years.

There are two keys elements that make Svelto.Tasks powerful: the runners (or schedulers) and the continuation. The runners are designed to run every kind of IEnumerator (or often called task) on every kind of defined Runner. Svelto.Tasks already ships with a lot of Unity related Runners, but two are platform agnostic: the MultithreadRunner and the SyncRunner.

The concept of continuation (similar to await/async) is even more powerful. It allows to start a task running on a specific scheduler from another task running on another scheduler and continue from there once the new task is finished. This is much simpler to code than to explain. Although there are similarities, Tasks.Net and Svelto.Tasks are obviously designed differently as the latter is designed around all the problematic intrinsic in a game development production, including performance.

Enough talk now so let’s dive in the details. Open the scene main.unity and click on the MillionPoints GameObject. Be sure that the MillionPointsCPU Monobehaviour is enabled and the GPU one is disabled. Run it, don’t expect great performance in the editor though, there is a huge difference between editor and clients in this case. I will show all the differences in bit. BTW I am currently using Unity 2018.1b3.

The goal is to perform the following cache friendly CPU instructions inside the ParticleCPUKernel MoveNext() method inner loop on 1 Million points every frame using tasks running on multiple cores:

I skip all the compute buffer initialization stuff because is not relevant for this article and I start directly from the function StartSveltoCPUWork(). First of all, I decided to split the job in 16 threads. As you know, when it’s time for CPU intensive work, increasing the number of threads more than the number of available cores gives a logarithmic-like gain, so even 16 is already probably too much on my 8 cores machine. Since the operations to apply on the vertices are quite straightforward, we can just have each thread operating on a specific segment of the vertices array and this is what the function PrepareParallelTasks method does.

The MultiThreadedParallelTaskCollection has a similar interface to the simpler ParallelTaskCollection, but it is able to run N tasks on M threads using M ParallelTaskCollections. You may have figured out how this works already. it basically creates M Threads and runs one ParallelTaskCollection on each. The N tasks are split among the M ParallelTaskCollections, so the execution of M ParallelTaskCollections, and not N tasks, are truly in parallel. When N coincide to M, then all the tasks run in parallel like in the case of this example. In this case we initialize 16 ParallelTaskCollections running 1 task each. This task applies the ParticlesCPUKernel methods instructions on 1.000.000 / 16 particles.

Remember, the MultiThreadedParallelTaskCollection is just an IEnumerator than must run like the others tasks in Svelto.Tasks, so it doesn’t just start on its own. For the purpose of this article, I show two different ways to start the collection execution. All the following code assumes that you know well how to work with the yield instruction.

Let’s start from enabling the #Test2 define as it is the simplest scenario. In this case, everything is driven by a single loop running on the main thread.

As I said, this is the simplest flow to understand. The intention is to wait for the other threads to finish, that’s why I run the multithreaded parallel collection execution on the SyncRunner that is designed to stall the current thread. Once the execution is completed, the rest of the code will run on the main thread.

This is probably not the best solution, because the other threads could start compute the particles operations for the next frame between the end of this function and its next execution. Let’s see this visually:

in the way we are running the loop, the multithreaded collection starts during the Update phase, as the main thread loop is running on the Update Runner. While it’s running it stalls the main thread, so nothing else can be executed, then it finished and the process continue. The red vertical line can give an idea about where the threads start and complete to run. However why should the update phase wait for the threads to finish their operation when those could have ran outside the Update phase in parallel with it?

We could invert the way we trigger the multi-threads operation so that we compute the next values in between updates instead than inside the update phase.

There is more than one way to achieve this. Let’s see an example enabling the define #Test1

In this case we have two loops. One running on the main thread, as we have started the MainThreadOperations task using the standard UpdateRunner, and the other on an other thread, starting OperationsRunningOnOtherThreads on the standard MultiThreadedRunner before the main loop starts. Note that the standard threaded runner is used just to check when the _multiParallelTask completes, as the _multiParallelTasks uses its own set of threads to run.

At this point _waitForSignal and _otherwaitForSignal are used to signal when the operations on each thread are completed. I hope this is intuitive. The main thread first waits for the other threads to finish, when this happens the draw mesh is issued and the main thread will signal the other thread to start the operations for the next frame. Since the other thread can finish running before the next update, it will yield its execution to other threads until the are signalled to compute the next values.

The _breakit part is a bit awkward at the moment. It’s necessary to be sure that the threads will stop when the code is executed in the editor, as stopping the execution in the editor won’t kill the running threads like it normally happens with a standalone application (note, it has been eliminated with the latest updates).

Last scenario is an alternative possibility:

so what’s the deal here? Again two loops, although this time the one running on the main thread just keeps on issuing the draw mesh with the last updated particles. In the multithreaded loop instead continuation is exploited to set the freshly computed particles to the compute buffer. Since Unity will throw an exception if this is done on other threads, we have to run the task on the main thread and wait it to finish.

The frame rate for this case is much higher, but don’t be fooled, the reason being that the mainthread this time is not stalled, however the particles will be updated at the same frame rate of the other cases, more or less.

Before to test the performance, let me spend two words about the cache friendly code created to compute the particle values. As you may notice, I use a lot of ref and out. This is because it’s very important to avoid copying structure when not necessary as it can hit hard the stack. This is also why c# 7.2 has recently introduced the byref returning value and byref variables, so that I can write simpler code to avoid copies on the stack of structs. You should always pass your Vector[N] by ref or out.

Let’s do some benchmarking now, using the #Test1 case:

Mono/.Net 4.6 ~20fps
IL2CPP ~48fps
UWP ~23fps
UWP .Net Native ~59fps(*)

Last test was just an experiment. I knew that UWP .net core code can be compiled in native code through the .net native toolchain, therefore I had obviously to compare it against IL2CPP. The result make me think that a future integration in Unity for standalone platforms, if possible, would be beneficial (update: now available in Unity 2018, * I wasn’t able to reproduce the UWP .Net Native timings, so I may have done something wrong then. They are like the IL2CPP timings in my new tests)

And finally the project can be downloaded from github as usual: https://github.com/sebas77/Svelto.Tasks.MillionPoints

P.S.: If you notice that Svelto.Tasks code can perform better, please tell me, I am sure there can be some areas to improve, as I continuously do it.

Update: Svelto.Tasks and Unity Jobs System comparison

Obviously I was curious to see how Unity Jobs Systems compares with Svelto.Tasks. Both systems have been designed to write multi-threaded code worries free, but Svelto.Tasks relies only on the power of c#, while Unity Jobs is allegedly mainly written in native code, exploiting the internal job workers of the Unity Engine.

There isn’t any difference between threads in c# and threads in c++. Threads are anyway handled by the operative system, therefore in both languages what it’s implemented is just a wrapper of the underling system. For this reason, I had some concerns about Unity Jobs System, as we have noticed in the past how marshaling could affect the performance of c# code.

However I can confirm that, once compiled, Unity Jobs runs at the same speed of Svelto.Tasks with very similar results. The IL2CPP still needs some optimizations, as Svelto.Tasks there is actually faster. As IL2CPP is not affected by the marshalling issue, it’s very likely that the Unity Jobs work in IL2CPP is not completed.

Let’s start from the standard mono version first. I have updated the code in github and the Unity Jobs System version is under the folder UnityJobsKernel.

I have compiled the “Naive” (#define TEST2) version to execute Svelto tasks and the Signal based version (#define TEST1). Both of them execute at the same speed of the Unity Jobs version. It’s important to note that the Unity Jobs version maps almost exactly to the “Naive” version of the Svelto solution.

Now you would wonder, why is the Naive version not slower than the Signal based version if it’s so Naive? I will come to that later, as it actually behaves exactly as I assumed, meaning that in a real world scenario, the “advanced version” could be faster than either the Naive version and the Unity jobs system version.

Let’s see the results of the 1 Million Points simulation (Unity Jobs is not available in UWP yet, hence I couldn’t test the performance there). This time I am measuring milliseconds ranges (the lower the faster)

Svelto Tasks Svelto Tasks Naive  

Unity Jobs System

Mono 56-59 57-59 56-57
IL2CPP 24-26 24-25 29-30
UWP 45-49 45-49 n/a
UWP Native 23-24 23-24 n/a

Pretty close right? Difference in IL2CPP could be significant, but Unity will very likely improve it. I am not sure what happened with the UWP Native Toolchain profiling there. Somehow I am not able to reproduce the timings of the first profiling. Maybe I did something wrong then? I don’t care investigating as the platform is not a priority for me and actually it would mean that IL2CPP is as good as the .Net Native Tooilchain platform.

I will not dig in the Unity jobs details. I don’t have the time and it’s out of the scope of this article. However I will explain why it maps to the naive version of Svelto.Tasks solution. As explained in the first part of this article, the naive version stalls the main thread until the offloaded operations are not completed. Working on Svelto.Tasks I learned that the concept of main thread is honestly obsolete. I would love to work with an engine where the main thread is not a thing, after all that’s also the mentality behind DX 12 and Vulcan. Even with the Svelto.Tasks Vanilla Example the applications runs in its own thread that is not the “main thread”.

However following the “naive” approach, its not optimal code must rely exclusively on number of CPU cores and their power, more than on the fact that the code is multi-threaded. This is what I explained above when I showed the unity frame rendering. We don’t want just to trigger a “burst” of operations, we actually should be able to run code in parallel with the rest of the unity pipeline.

As showed, the “Advanced” and “Naive” update have similar results, but what would happen if the main-thread executes other heavy operations outside the Job update? Let’s see what happens if we add a Thread.Sleep(10) in the main update simulating a main thread taking 10 milliseconds extra for other operations:

  Svelto Tasks Unity Jobs System
Mono 57-62 68-70

Errata Corrige:

actually there is a way to run the just scheduled jobs without needing to call the Complete() method. This can be done through the static ScheduleBatchedJobs() method of the JobHandle class. I changed the Unity Jobs System kernel in this way:

and now the new timings are more in line with Svelto.Tasks:

Svelto Tasks Unity Jobs System
Mono 57-62 59-62

This is closer to a real life scenario than all the Unity Jobs demo showed so far. I understand anyway, Unity wants to keep things simple. However I don’t think with Svelto they are so hard, so personally I will keep on using Svelto (for this and many other reasons), but in future I will integrate the Unity Jobs because of the future coming “burst” technology, however this one must perform faster than IL2CPP, otherwise there is no real point in using it.

As usual, don’t quote me without testing yourself! I write these articles only when I have time, which usually is during the late night, so it’s better for you to double check always, running my example that you can find on github 😉

Learning Svelto.ECS by example – The Vanilla Example

It’s not simple to learn a new framework and even less shift code paradigm. This is why Svelto.ECS has been written with simplicity in mind. Even so, I realised that I ought to write more tutorials to clarify some simple concepts that are not straightforward to grasp at first. That’s why I thought to explain, step by step, what I have done with the examples bundled on github.

A word before to start: I would really appreciate if you send me your feedback and, when you start to have a good understanding of the framework, share your knowledge with others. I am willing also to review your code when possible, if you share it publicly, for example on github. You can also feel free to update the Svelto.ECS wiki page and send pull requests. You may also think that it would be better to study Unity ECS framework and in this case I would say that being an open sourced c# framework built on top of the Unity engine, it will probably do the same things Svelto.ECS does, only in a different way. At the moment I recognize that Svelto.ECS has some unique features that cannot be found on other currently available frameworks, so it’s your choice to take advantage of them and compare them. Svelto.ECS has been engineered to be both fast and convenient to use, but also rigid enough to dictate, when possible, the right way to design your code, reliving the coder from the responsibility to take the right decisions to write maintainable code easy to refactor without penalizing performance.

All that said, let’s start from the simplest possible project! This project doesn’t even use Unity, so you can just run it with Visual Studio or compatible IDE/compiler. Just for fun I decided to use the .Net Core and .Net Standard frameworks (yes both!!) but the code will compile also on the classic .Net framework. Get it from https://github.com/sebas77/Svelto.ECS.Vanilla.Example

It’s a bit longer than I would have liked because I decided eventually to cram inside most of the features available. Still, if you remove the comments, the real lines of codes aren’t many!

Let’s have a look at the lot, hoping you will not get lost (you shouldn’t)

Well, well, still there? I put the whole code in one file only to demonstrate that the boilerplate to write is very minimal, remember I am showing virtually all the features available in the framework. Now, instead to repeat what I wrote in the code comments, I will try to explain the concept behind it. Please be sure to have understood the previous code before to continue reading. When I develop with Svelto.ECS (always now :)), the real first task I perform is to identify conceptually the entities I want to manage. This first step is of crucial importance and I will explain better why in the next articles about pitfalls and best practices. You must strive to identify an Entity inside the game design domain, using a game design language. For example, good entities are: MainCharacterEntity, CoinGUIWidgetEntity, MinimapEntity, TileEntity (if you have tiles that perform any logic) and so on. In the example, I identified the entities SimpleEntity and SimpleEntityStruct. Those are really bad names, first because they are conceptually abstracted and they must not be, second because the fact that an Entity will be built in a group doesn’t actually identify a new entity as the same entity can be built either grouped or not. Those names are for the purpose of the simplest example I could write, so forgive me to break the rules.

Let’s pick-up now the entity we want to start to work with, for example the SimpleEntity. The next step is to add behaviours to this entity and in order to do so an Engine must be created (If you never read any article of mine before, please note that I chose the word Engine over System as I find it more meaningful). Now for the sake of the example I called it BehaviourForSimpleEntityEngine but you have to name your engines according the behaviour you want to implementFor example good names for Engines are: WingsPhysicEngine, WheelsRenderingEngine, MachineInputEngine and so on…

In a classic ECS framework, Systems usually handle entities directly. Instead I decided to use a different approach that I first saw in the Actionscript ECS framework called Ash, introducing the concept of EntityView (previously named Node in Svelto.ECS 1.0). It’s very simple: an EntityView is how the Engine must see the Entity. It’s basically a filter to not let the Engine have access to all the Entity data. In this way you can focus the engine responsibility, since global data access would promote less modularity. If the Lazy Coders(*) could access to ALL the entity components nothing would keep them from adding more logic in the engine that is not really responsibility of that engine. EntityViews promote encapsulation, modularization, single responsibility and design rigidity.

In our case the BehaviourEntityClassEngine can see and handle the entities SimpleEntity through the views BehaviourEntityViewForSimpleEntity (yeah, it’s pretty hard to name abstract concepts)Entity views can be created as classes for flexibility and as structs for speed. I will talk about EntityViews more on other articles as their properties must be explained properly, but I don’t want to add too much information right now.

Now careful about this: an Engine usually should just implement the IQueryingEntityViewEngine interface  as most of the time is enough to query the entities you want to handle with the EntityViewsDB. However Svelto.ECS has also the feature to know when entities are removed and added from the engine through the Add and Remove callbacks. If you want to enable them, you can use the SingleEntityViewEngine and MultiEntityViewsEngine pre-made classes which help to reduce boilerplate code. You will notice that the MultiEntityViewsEngine doesn’t accept more than four EntityViews. This because empirically I noticed that an engine handling more than four different entities is very likely to have more responsibilities than should and must be refactored. Also note, engines should never contain states and if you design your entity views properly, you don’t really need to. A perfect engine is totally encapsulated and its methods are both private and static (pure encapsulated logic). Advanced scenarios will be explained in other articles.

EntityViews group Entity components. When EntityViews are classes, EntityViews group components as interfaces. In Svelto.ECS components are always seen as interfaces, this is another unique feature and I will explain the benefits on other articles. In our simple example, the SimpleEntity is composed by just one component the ISimpleComponent. An entity can be composed by several components and an EntityView can reference more than one entity component, but for the time being, let’s keep things simple. When Entity components are structs, they actually coincide with the components themselves. Entity as struct is an advanced concept and I don’t need to explain it now, as you would need to use them only in very specific scenarios or just for fun.

OK that’s it, the code now compiles as you have all the elements needed, however it will not run since those components are not implemented…well that’s right, I didn’t explain how to actually build the entities. Build entities come for last in the order of things to do.

Often you will read that in modern ECS frameworks entities are just IDs. This concept can be extremely confusing for a person who comes from an object oriented background, so for the moment ignore it. In Svelto.ECS you have a real convenient way to identify an Entity and this is the EntityDescriptor. In the example I avoided all the boiler plate needed to create an EntityDescriptor using the GenericEntityDescriptor and the MixedEntityDescriptor pre-made classes. Use them whenever possible.

The EntityDescriptor links Entity Components to the relative EntityViews and/or EntityStructs. If EntityViews as classes are used, we just need a way to implement those components as interface. Finally this is what the Implementors are for! Implementor is a very powerful concept, but I don’t need to explain them in detail now. After all, from the example, it should be simple to understand how to use them.

OK This should be enough to give you an introduction to the framework! Is something not clear? Please leave a comment here, I will try to help!

In summary:

  • Always define your Entities conceptually first
  • Start to write Engines and identify the EntityViews that these engines need. EntityViews define how the Engine must see the entities it must handle, so EntityViews are named after the Engine and the Entity and must be created together with the Engine and in the Engine namespace.
  • Define the Entity Components the engine must access to while you code the Engine behaviours, use refactoring tool to easily generate the interfaces you need (I will explain some tricks on other articles)
  • Create the EntityDescriptors that link the Entities to the EntityViews
  • Finally implement the Entity Components as interfaces through Implementors and build your entities. The EntityView instances will be automatically generated and inserted in the EntityViewsDB by the framework.

(*) Lazy coders are the best, but if they lack of discipline they could lead to catastrophes 🙂

if you are new to the ECS design and you wonder why it could be useful, you should read my previous articles:

Check also the other ECS articles:

Learning Svelto.ECS by Example: The Survival Example

and for reference:

Svelto ECS is now production ready

Svelto.ECS 2.0 is production ready

This an introductory post to Svelto.ECS 2.0. Other two posts will follow, one explaining the examples line by line and another explaining all the Svelto.ECS concepts in a simpler fashion than before. Therefore this article is written for who already knows Svelto.ECS.

Svelto.ECS 2.0 is (at the time of writing this) available in alpha stage (pick the alpha branch to have a look now) and, compared to Svelto.ECS, introduce new features, new optimizations and unluckily some breaking changes (main reason why I decided to bump the major version).

I have also updated the Svelto.ECS.Examples code (available on the alpha branch at the time of writing) and added a new “vanilla” example which I can consider the minimum amount of code to write to show almost all the framework features available without using Unity.  The new example is, in fact, targeting Net.Core and Net.Standard.

I will quickly go through the list of new features:

  • New terminologies have been introduced. If you read the previous articles or used Svelto.ECS 1.0, you must know that the term “Node” has been abolished as it was meaningless in the framework context and has been renamed to “EntityView“. I will explain why it makes more sense with the next articles.
  • The GenericEntityDescriptor has been extensively used in all the examples to show how to reduce the boilerplate code with minimum effort.
  • EntityDescriptors are now pure static classes and must not be instantiated. The EntityDescriptor must conceptually identify your entity. The future articles will explain why it’s a big design mistake to identify Entities as EntityViews instead. A Entity is now built with the new signature entityFactory.BuildEntity<SimpleEntityDescriptor>(entityID, implementors). In any case you should rarely need to inherit from EntityDescriptor as most of the time you must inherit from GenericEntityDescriptor and MixedEntityDescriptor.
  • EntityDescriptors can now refers, using the same signature, either to class based EntityViews and struct based Entities, so that entityFactory.BuildEntity<SimpleStructEntityDescriptor>(entityID); will actually build struct based entities. Previously the struct based entities were build in a more awkward fashion. While I deprecated the previous method to build struct entities, changes may be introduced later to support thread safe code as currently Svelto.ECS data structures are not thread safe (custom structures or strategies must be adopted to support thread safe engines). With the next articles I will explain when and how to use EntityViews and Entities as struct. EntityViews should be used for flexibility and Entities as structs for speed. An EntityDescriptor must inherit from the new MixedEntityDescriptor generic class to build entities made out of Entity structs and/or EntityViews.
  • Another very important feature is the possibility to create entities inside buckets (groups). In this way is possible to query a set of entities from a specific group. This is a feature that was absolutely needed in our products (and very likely in yours too). For example we can now easily query all the wings for each machine in Robocraft. This means that we could potentially create a Svelto.Tasks taskroutine for each machine inside the WingsPhysicEngine and stop a specific task if all the wings of a specific machine are shot off. Previously there wasn’t a way to be able to split wings per machine unless custom datastructures were used inside the engines. The new function has the signature: entityFactory.BuildEntityInGroup<SimpleStructEntityDescriptor>(entityID, groupID) and can be used both for struct Entities and class EntityViews.
  • Entities can also be moved between groups. This was a last minute idea and I need to experiment more with it.
  • EnginesRoots must never be hard referenced. Previously I made two mistakes: first hard referencing it inside the SubmissionNodeScheduler (now called SubmisisonEntityViewScheduler) and second letting it be passed around through the IEntityFactory interface. Now an IEntityFactory wrapper must be generated from the EngineRoot.
  • A IRemoveComponent cannot be implemented with a custom implementor any more. Just use it and don’t worry to pass a RemoveImplementor among the other implementors (it will be ignored as it is created by the framework now).
  • The framework is now more rigid and output more warnings in case of misuse.
  • All the previous framework engines, except for the SingleEntityViewEngine and MultiEntityViewsEngine, have been deprecated. An engine can inherit either from one or the another and/or implement IQueryingEntityViewEngine (which is an IEngine too now)
  • I am adding comments all over the code, but I will add more while I write the new articles to come.

then optimizations:

  • Building an entity can be up to 3.3 times faster than in svelto.ECS. However remember that entities should be sporadically or never built during the execution of the gameplay. Entities should always be prebuilt and enabled/disabled when needed. This is even more effective than pooling them, so design them properly. Below a table to show how long takes to build 256k entities as class with Unity 2017.3 (building entities as struct is an order of magnitude faster).
    Version Platform Time(MS)
    1.0 .net 2.0 903
    2.0 .net 2.0 344
    2.0 .net 4.6 260(!)
    2.0 .net Core UWP 203
    2.0 .net Core IL2CPP 288

    if you wonder while IL2CPP is a bit slower, is because I cannot dynamically create code that allows avoid Reflection functions, however the c++ implementation of the reflection is much faster so no much trouble there.

  • Thanks to the new feature to move entities between groups, you can create a group to store the disabled entities, so that you can retrieve a disabled entity yourself to be reused later.
  • Several optimizations here and there, especially to reduce allocations.

and breaking changes (at least the ones I took note of):

  • The BuildEntity signature is totally different and will need to be explained in an another article.
  • IRemoveComponent was breaking a not enforced (yet) rule of Svelto.ECS which is that a component cannot hold references to other classes (except Unity ones at the moment). This is because a component must be seen a data container and cannot be used to call external functions. Thus, the method to remove an entity has changed to _entityFunctions.RemoveEntity<SimpleEntityDescriptor>(entityView.ID);
  • Mass Renames (hope I didn’t get any wrong, I will fix later in case):
    • rename IQueryableNodeEngine into IQueryingEntityViewEngine
    • rename IEngineNodeDB into IEngineEntityViewDB
    •  rename NodeWithID into EntityView
    •  .QueryNode< to .QueryEntityView<
    • .QueryNodes< to .QueryEntityViews<
    • .TryQueryNode( to .TryQueryEntityView(
    • all the nodesDB must be renamed to entityViewsDB
    •  MultiNodesEngine to MultiEntityViewsEngine
    • SingleNodeEngine to SingleEntityViewEngine
    • UnitySumbmissionNodeScheduler is UnitySumbmissionEntityViewScheduler
    • INodeBuilder to IEntityViewBuilder (you should use the GenericEntityDescriptor/MixedEntityDescriptor though)
    • NodeBuilder to EntityViewBuilder  (you should use the GenericEntityDescriptor/MixedEntityDescriptor though)
  •  ICallBackOnAddEngine doesn’t exist anymore and it has been merged into the IQueryingEntityViewEngine interface. This means that the Ready function must be always implemented in these cases.

N.B.: As long as Svelto.ECS 2.0 stays in to alpha state, do not use in a production environment. It still needs to be heavily tested and I will need to write some unit tests too. However you are invited to start experimenting with it and leave some feedback.

How Svelto.ECS + Svelto.Task help writing Data Oriented, Cache Friendly, Multi-Threaded code

Note: this article assumes that the reader knows how to use Svelto.ECS and Svelto.Tasks although the findings are interesting regardless.


New exciting features are coming to Svelto.ECS and Svelto.Tasks libraries. As I am currently focused on optimizing Robocraft for Xbox One, I added several functionalities that can help making our game faster. Hence I decided to write a simple example to show some od them. I have to be honest, it’s very hard to think about a simple example that makes sense. Every game has its own needs and what makes sense for one couldn’t be that meaningful for others. However everything boils down to the fact that I added features to exploit data locality (CPU cache) and easily write multi-threaded code. I have already discussed about how to use Svelto.Tasks multi-threaded ITaskRoutine, so I hope I can now show how to use them with Svelto.ECS. Spoiler alert: this article only won’t be sufficient to show you the potentiality of the library, as being more thorough would make this post too long and complex, therefore I encourage you to experiment on your own or wait for my next articles (that come every six months or so :P). This article also assumes that you know the basic concepts behind Svelto.ECS and Svelto.Tasks.

Initially I wanted to write an example that would mimic the boids demo showed at the unite roadmap talk that you can briefly see on this unity blog post. I soon decided to stop going down that route because it’s obvious that Joachim took advance of the new thread safe transform class or even wrote a new rendering pipeline on purpose for this demo, as the standard Transform class and even the GameObject based pipeline would be the biggest bottleneck impossible to parallelize (Note: I eventually found a work around to this problem to show visually the power of the multi core Svelto.Tasks approach). Since I believe that massive parallelism makes more sense on the GPU and that multi-threading should be exploited on CPU in a different way, I decided to not invest more time on it. However as I had some interesting results, I will use what is left of this useless example I wrote to show you what potential gain in milliseconds you may get using Svelto.ECS and Svelto.Tasks. I will eventually discuss how this approach can potentially be used in a real life scenario too.

Svelto.ECS and Svelto.Tasks have a symbiotic relationship. Svelto.ECS allows to naturally encapsulate logic running on several entities and treat each engine flow as a separate “fiber“. Svelto.Tasks allows to run these independent “fibers” asynchronously even on other threads.

GPUs work in a totally different fashion than CPUs and operative systems take the burden to handle how processes must share the few cores usually available. While on a GPU we talk about hundreds of cores, on a CPU we can have usually only 2-12 cores that have to run thousands of threads. However each core can run only one thread at time and it’s thanks to the OS scheduling system that all these threads can actually share the CPU power. More threads run, more difficult is the OS task to decide which threads to run and when. That’s why massive parallelism doesn’t make much sense on CPU. At full capacity, you can’t physically run more threads than cores, hence multi-threading on CPU is not meant for intensive operations.

The Example

You can proceed now downloading the new Svelto.ECS cache example and open the scene under the Svelto-ECS-Parallelism-Example folder. Since I removed everything from the original demo I was building, I can focus on the code instructions only and that’s why this example is the simplest ever I made so far, as it has only one Engine and one Node. The demonstration will show four different levels of optimization, using different techniques and how is possible to make the same instructions run several times faster. In order to show the optimization steps, I decided to use some ugly defines (sorry, I can’t really spend too much time on these exercises), therefore we need to stop the execution of the demo in the editor and change the define symbols every time we want to proceed to the next step. Alternative you can build a separate executable for each combination of defines, so you won’t have to worry about the editor overhead during the profiling. The first set of defines will be: FIRST_TIER_EXAMPLE;PROFILER.

Let’s open and analyse the code. As usual you can find our standard Context and relative composition root (MainContextParallel and ParallelCompositionRoot). Inside the context initialization, a BoidEngine is created (I didn’t bother renaming the classes) as well as 256,000 entities. Seems a lot, but we are going to run very simple operations on them, nothing to do with a real flock algorithm. Actually the instructions that run are basically random and totally meaningless. Feel free to change them as you wish.

The operations that must be executed are written inside the BoidEnumerator class. While the code example is ugly, I wrote it to be allocation free to be as fast as possible. A preallocated IEnumerator can be reused for each frame. This enumerator doesn’t yield at all, running synchronously, and operates on a set of entities that must be defined beforehand. The set of operations to run, as already said, are totally meaningless and written just to show how much time very simple operations can take to run on a large set of entities.

The main loop logic, enclosed in the Update enumerator, will run until the execution is stopped. It’s running inside a standard Update Scheduler because I am assuming that the result of the operations must be available every frame. In the way it’s designed, the main thread cannot be faster than the execution of the operations even if they run on other threads. This is very easily achievable with Svelto.Tasks.

The Naive Code (Editor 113ms, client 58ms)

So, assuming you didn’t get too much confused by the things happening under the hood, running the example will show you that the operations takes several milliseconds to be executed (use the stats window if you use the editor). in my case, on my I7 CPU, running the FIRST_TIER_EXAMPLE operations takes around 113ms. 

this is how the code looks at the moment:

Let’s have a look a the MSIL code generated

this set of operations runs 4 times for each entity; As I stated before, they are meaningless and random, it’s just a set of common and simple operations to run.

The Callless Code (Editor 57ms, Client 23ms)

Let’s now switch to SECOND_TIER_EXAMPLE, changing the defines in the editor and replacing FIRST_TIER_EXAMPLE. Let the code recompile. Exactly the same operations, but with different instructions, now take around 65ms (client 24ms)…what changed? I simply instructed the compiler to not use the Vector3 methods, running the operations directly on the components.

The code looks like:

It appears to me that the main difference is the number of call/callvirt executed, which obviously involves several extra operations. All the calls involve saving the current registries and some variables on the stack, call virt involves an extra pointer dereference to get the address of the function to call. More importantly, we avoid the copy of the Vector3 struct due to the fact that the method results are passed through the stack.

Let’s quickly switch to THIRD_TIER_EXAMPLE. The code now runs at around 57ms (client 23ms, almost no difference, interesting), but the code didn’t change at all. What changed then? I simply exploited the FasterList method to return the array used by the list. This removed another extra callvirt, as the operator[] of a collection like FasterList (but same goes for the standard List) is actually a method call. An Array instead is known directly by the compiler as a contiguous portion of allocated memory, therefore knows how to treat it efficiently.

The Cache Friendly Code (Editor 24ms, Client 16ms)

Have you have heard how ECS design can easily enable data oriented code? This is what the FOURTH_TIER_EXAMPLE define is about. The code now runs at around 24ms, another big jump, but again it seems that the instructions didn’t change much. Let’s see them:

It’s true that now the code is call free, but would this justifies the big jump? The answer is no. Just removing the last call left would have saved around 10ms only, while now the procedure is more than 30ms faster. The big saving must be found in the new EntityViewStruct feature that is enabled by the FOURTH_TIER_EXAMPLE enables. Svelto.ECS allows now to build EntityViews as class and as struct. When they are built as struct, you can exploit the full power of writing cache friendly data oriented code.

If you don’t know how data is laid out in memory, well, stop here and study how data structures work. If you know c++ you are already advantaged, because you know how pointers work, but otherwise, you must know that you can write contiguous data in memory only with array of value types, including struct, assuming that these structs contain only value types! Svelto ECS now allows to store EntityViews either as classes and structs, but when structs are used, the EntityViews are always stored as simple array of structs, so that you can achieve the fastest result, depending how you design your structs (Structure of Arrays or Array of Structures, it’s up to you!)

To put it simply, when you use references, the memory area pointed by the reference, is very likely nowhere in the memory near where the reference itself is. This means that the CPU cannot exploit the cache to read contiguous data. Array of value types (which can be struct containing valuetypes) are instead always stored contiguously.

Cache is what makes the procedure much faster. Nowadays CPUs can read a continuous block of data up to 64 bytes in one single memory access from the L1 cache (which takes around 4 cycles of clock). The L1 cache is efficiently filled by the CPU on data access, caching 32KB of consecutive data. As long as the data you need to access is in those 32KB of cache, the instructions will run much faster. Missing data from the cache will initialize a very slow RAM memory access, which can be hundreds of cycles slow! There are many articles around talking about modern CPU architecture and how data oriented code takes full advantage of it, so I won’t spend more time, but I will link some resources once I have the time, so check the end of the article for new edit every now and then.

Caching is weird though. For example it could come natural to store the whole EntityViewStruct in a local variable, while this seems a harmless instruction, in a data oriented context it can be disaster! How does removing the storing to a local variable of the struct make it faster? Well this is where things get very tricky. The reason is that the contiguous amount of data you need to read must be not too big and must be able to be read in one memory access (32 to 64bytes, depending the CPU).

Since our BoidNode struct is quite small, caching the variable here actually would have made the execution just slightly slower. However if you make the Boidnode struct larger (Try to add other four Vector3 fields and cache the whole structure in a local variable), you will wreck the processor and the execution will become largely slower! (edit: this problem could be solved with the new Ref feature of c# 7 which, unluckily, is still not supported by Unity)

Instead accessing directly the single component won’t enable the struct-wide read due to the copy to a local variable and since x,y,z are read and cached at once, these instructions will run at the same speed regardless the size of the struct. Alternatively you can cache locally just the position Vector3 in a local variable, which won’t make it faster, but it will be still work fast regardless the size of the struct.

The Multi Threaded Cache Friendly Code (Editor 8ms, Client 3ms)

To conclude, let’s keep the FOURTH_TIER_EXAMPLE define, but add a new one, called TURBO_EXAMPLE. The code now runs at around 8ms. This because the new MultiThreadedParallelTaskCollection Svelto.Tasks feature is now enabled. The operations, instead to run on one thread, are split and run on 8 threads. As you already figured out, 8 threads doesn’t mean 8 times faster (unless you actually have 8 cores :)) and this is the reality. Splitting the operations over multiple threads doesn’t only give sub-linear gains, but also diminishing return, as more the threads, less and less faster will the operations run, until increasing threads won’t make any difference. This is due to what I was explaining earlier. Multi-threading is no magic. Physically your CPU cannot run more threads than its cores and this is true for all the threads of all the processes running at the same time on your Operative System. That’s why CPU multi-threading makes more sense for asynchronous operations that can run over time or operations that involve waiting for external sources (sockets, disks and so on), so that the thread can pause until the data is received, leaving the space to other threads to run meanwhile. Memory access is also a big bottleneck, especially when cache missing is involved (the CPU may even decide to switch to other threads while waiting for the memory to be read, this is what Hyperthreading actually does)

This is how I use the MultiThreadedParallelTaskCollection in the example:

This collection allows to add Enumerators to be executed later on multiple threads. The number of threads to activate is passed by constructor. It’s interesting to note that the number of enumerators to run is independent by the number of threads, although in this case they are mapped 1 to 1. The MultiThreadedParallelTaskCollection, being an IEnumerator, can be scheduler on whatever runner, but the sets of tasks will always run on their own, pre-activated, threads.

The way I am using threading in this example is not the way it should be used in real life. First of all I am actually blocking the main thread to wait for the other threads to finish, so that I can actually measure how long the threads take to finish the operations. In a real life scenario, the code shouldn’t be written to wait for the threads to finish. For example, handling AI could run independently by the frame rate. I am thinking about several way to manage synchronization so that will be possible not just to exploit continuation between threads, but even run tasks on different threads at the same time and synchronize them. WaitForSignalEnumerator is an example of what I have in mind, more will come.

All right, we are at the end of the article now. I need to repeat myself here: this article doesn’t show really what is possible to do with Svelto.ECS and Svelto.Tasks in its entirety, this is just a glimpse of the possibilities opened by this infrastructure. Also the purpose of this article wasn’t about showing the level of complexity that is possible to handle easily with the Svelto framework, but just to show you how important is to know how to optimize our code. The most important optimizations are first the algorithmic ones, then the data structure related ones and eventually the ones at instruction level. Multi-threading is not about optimization, but instead being able to actually exploit the power of the CPU used. I also tried to highlight the CPU threading is not about massive parallelism and GPUs should be used for this purpose instead.

The Importance of the Compiler (Client 1.5ms!!)

I started to wonder, what if I had the chance to use a good c++ compiler to see if it could do a better work with auto-vectorization? After all, we are aware of the fact that the JIT compiler can’t do miracles in this sense. Since IL2CPP is not available for Windows platform, I compiled the same code for UWP, using the IL2CPP option. I guess the results are pretty clear, IL2CPP produces an executable that is twice as fast as the c# version. Since there isn’t any call to the native code, can’t be because of the benefit due to the lack of marshalling. I haven’t had the time to verify yet what the real difference is, but auto-vectorization may be the reason.


While this article is implicitly (and strongly) advising you to follow the Entity-Component-System pattern to centralize and modularize the logic of your game entities and gives another example of Svelto.Tasks usage to easily exploit multithreading, it has the main goal to show you why is needed to pay attention to how the code is written even if we talk about c# and Unity. While the effectiveness of these extreme optimizations largely depend on the context, understanding how the compiler generates the final code is necessary to not degrade the performance of large projects.

Having a light knowledge of modern CPUs architectures (like I do) helps to understand how to better exploit the CPU cache and optimize memory accesses. Multi-threading really needs more separate articles, but I now can state with certainty that exploiting the full power of the CPU with Unity is more than feasible, although the biggest bottleneck will remain the rendering pipeline itself.

To be honest, most of the Unity optimization issues are often related to the marshalling that the Unity framework performs when is time to call the Unity native functions. Unluckily these functions are many and are called very often, becoming the biggest issue we found during our optimizations. GameObject hierarchy complexity, calling Transform functions multiple times and similar problems can really kill the performance of your product. I may talk about these finding in future with a separate article, but many have been discussed already by other developers. The real goal to pursue is to find the right balance of what you can do in c# and what must be left to Unity. The ECS pattern can help with this a lot, avoiding the use of MonoBehaviour functionalities when they are not strictly necessary.

For example, Iet’s try to instantiate 256k GameObjects (yes it’s possible to do) and add a Monobehaviour each that simply runs the fastest version of the test. On my PC, it runs at 105ms, and even if profiling with Unity doesn’t give the best accuracy, it seems that half of this time is spent for Marshalling (very likely the time between switching between c# and c++ for each update, my current assumption is that Monobehaviours are stored as native objects and they need to switch to managed code for each single Update called, this is just crazy!).

Final Results:
Editor Client
ECS Naive Code 113ms 58ms
ECS No Calls Involved 57ms 23ms
ECS Structs only 24ms 16ms
ECS Multithreading (8 threads) 8ms 3ms
ECS Multithreading UWP/IL2CPP (8 threads) n/a 1.5ms
Classic GameObject/Update approach (no calls inside) 105ms 45ms
Classic GameObject/Update approach (no calls inside) UWP n/a 22ms


  • I crammed too many points in this article 🙂
  • Optimize your algorithms first, you don’t want any quadratic algoritm in your code. Logarithmic is the way to go.
  • Optimize your datastructures next, you must know how your data structures work in memory to efficiently use them. There is difference between an array, a list and a linkedlist!
  • Optimize your instructions paying attention to memory accesses and caching for last (if you want to squeeze those MS OUT!). Coding to exploit the CPU cache is as important as using multi-threading. Of course not all the Engines need to be so optimized, so it depends by the situation. I will link some good references when I have the time, as there is much more to discuss about this point.
  • When using Unity, be very careful to minimize the usage of wrappers to native functions. These are many and often can be easily identified in Visual Studio navigating to the declaration of the function.
  • Mind that Unity callbacks (Update, LateUpdate etc) are often Marshalled from native code, making them slow to use. The Unity hierarchical profiler tells you half the truth, showing the time taken inside the Update, but not between Updates (usually shown on the “others” graph). The Timeline view will show “gaps” between Update calls, which is the time used for Marshalling them.
  • Multithreading is tricky but Svelto.Task can massively help
  • ECS pattern allows to write code that exploits temporal and spatial locality (As you can manage all your entities in one central place). Try Svelto.ECS to see what you can do 😉
  • ECS pattern will also help you to minimize the use of Unity functionalities which most of the time involve costly marshalling operations to switch to the native code. Most of them are not really justified and due to old legacy code that has never been updated in 10 years!
  • I understand that is hard to relate to an example that iterates over 256k entities. However moving logic from Monobehaviours to ECS engines saved dozens of milliseconds in the execution of the real-life code of our projects.
  • Why don’t they finish IL2CPP for standalone platforms? At least windows only!
  • Don’t use the editor to profile 🙂

Please leave here your thoughts, I will probably expand this article with my new findings, especially the ones related to the cache optimizations.

For more Multi-threading shenanigans check my new article: Learning Svelto Tasks by example: The Million Points example 

External resources:

Svelto TaskRunner – Run Serial and Parallel Asynchronous Tasks in Unity3D

Note: this is an on-going article and is updated with the new features introduced over the time. 

In this article I will introduce a better way to run coroutines using the Svelto.Tasks TaskRunner. I will also show how to run coroutine between threads, easily and safely. You can finally exploit the power of your processors, even if you don’t know much about multithreading. If you use Unity, you will be surprised about how simple is to pass results computed from the multithreaded coroutines to the main thread coroutines.

What we got already: Unity and StartCoroutine

If you are a Unity developer, chances are you know already how StartCoroutine works and how it exploits the powerful c# yield keyword to time slice complex routines. A coroutine is a quite handy and clean way to execute procedures over time or better, asynchronous tasks.

Lately Unity improved the support of Coroutines and new fancy things are now possible to achieve. For example, it was already possible to run tasks in serial doing something like:

And apparently* it was also possible to exploit a basic form of continuation starting a coroutine from another coroutine:

*apparently because I never tried this on unity 4 and I wasn’t aware at that time that it was possible.

However lately it’s also possible to return an IEnumerator directly from another IEnumerator without running a new coroutine, which is actually almost 3 times faster, in terms of overhead, than the previous method:

Run parallel routines is also possible, however there is no elegant way to exploit continuation when multiple StartCoroutine happen at once. Basically there is no simple way to know when all the coroutines are completed.

I should add that Unity tried to extend the functionality of the Coroutines introducing new concepts like the CustomYieldInstruction, however it fails to create a tool that can be used to solve more complex problems in a simple and elegant way, problems like, but not limited to, running several sets of parallel and serial asynchronous tasks.

Introducing Svelto.Tasks

Being limited by what Unity can achieve, a couple of years ago I started to work on my TaskRunner library and spent the last few months to evolve it in something more powerful and interesting. The set of use cases that the TaskRunner can now solve elegantly, is quite broad, but before to show a subset of them as example, I will list the main reasons why TaskRunner should be used in place of StartCoroutine:

  • you can use the TaskRunner everywhere, you don’t need to be in a Monobehaviour. The whole Svelto framework focuses on shifting the Unity programming paradigm from the use of the Monobehaviour class to more modern and flexible patterns.
  • you can use the TaskRunner to run Serial and Parallel tasks, exploiting continuation without needing to use callbacks.
  • you can pause, resume and stop whatever set of tasks running.
  • you can catch exceptions from whatever set of tasks running.
  • you can pass parameters to whatever set of tasks running.
  • you can exploit continuation between threads (!).
  • Whatever the number of tasks you are running is, the TaskRunner will always run just one Unity coroutine (with some exceptions).
  • you can run tasks on different schedulers (including schedulers on different threads!).
  • you can transform whatever asynchronous operation into a task, thanks to the ITask interface.

A subset of use cases that the TaskRunner is capable to handle, is what I am going to show you soon, and I am sure you will be surprised by some of them :). TaskRunner can be used in multiple ways and, while the performance doesn’t change much between methods, you would fully exploit the power of the library only knowing when to use what. Let’s start:

The simplest way to use the TaskRunner is to use the function Run, passing whatever IEnumerator in it.

This simply does what says on the tin. It’s very similar to the StartCoroutine function, but it can be called from everywhere. Just pay attention to the fact that
TaskRunner uses the not generic IEnumerator underneath, so using generic IEnumerator, with a value type as parameter, will always result in boxing (as shown in the example above).

TaskRunner can be also used to run every combination of serial and parallel tasks in this way:

But why not exploit the IEnumerator continuation? It’s more elegant than using callbacks and we don’t need to use a SerialTaskCollection explicitly (with no loss of performance). We won’t even need to use two ParallelTasks:

if you feel fancy, you can also use the extension methods provided:

Svelto.Tasks and Unity compatibility

you are used to yield special objects like WWW, WaitForSeconds or WaitForEndOfFrame. Those functions are not enumerators and they work because Unity is able to recognize them and run special functions accordingly. For example, when you return WWW, Unity will run a background thread to execute the http request. If WWW is not able to reach Unity framework, it will never be able to run properly. For this reason, the MainThreadRunner is actually compatible with all the Unity functions. You can yield them, however there are limitations: you cannot yield them, as they are, from a ParallelTaskCollection. If you do it, the ParallelTaskCollection will stop executing and will wait Unity to return the result, effectively loosing the benefits of the process. Whenever you return a Unity special async function from inside a ParallelTaskCollection, you’ll need to wrap it inside an IEnumerator if you want to take advantage of the parallel execution. This is the reason why WWWEnumeratorWaitForSecondsEnumerator and AsyncOperationEnumerator exist.

TaskRoutines and Promises

When c# coders think about asynchronous tasks, they think about the .net Task Library. The Task Library is an example of an easy to use tool that can be used to solve very complex problems. The main reason why the Task Library is so flexible, is because it’s Promises compliant. While the Promises design has been proved proficient through many libraries, it can also be implemented in several ways, but in every case, what makes the promises powerful, is the idea to implement continuation without using messy events all over the code.

In Sveto.Tasks, the promises pattern is implemented through the ITaskRoutine interface.  let’s see how it works: an ITaskRoutine is a coroutine already prepared and ready to start at your command. To create a new ITaskRoutine simply run this function:

Since an allocation actually happens, it’s best to preallocate and prepare a routine during the initialization phase and run it during the execution phase. A task routine can also be reused, changing all the parameters, before to run it again.  Running an empty ITaskRoutine will result in an exception thrown, so we need to prepare it first. You can do something like:

In this case I used the function SetEnumeratorProvider instead of SetEnumerator. In this way the Task Runner is able to recreate the enumerator in case you want to start the same function multiple times. Let’s see what we can do:

We can Start the routine like this using

We can Pause the routine using

We can Resume the routine using

we can Stop the routine using (it’s getting tedious)

we can Restart the routine using

You can check the ExampleTaskRoutine example out to see how it works.

Let’s see how ITaskRoutine are compliant with Promises. As we have seen, we can pipe serial and/or parallel tasks and exploit continuation. We can get the result from the previous enumerator as well, using the current properties. We can pass parameters, through the enumerator function itself. The only feature we haven’t seen yet is how to handle failures, which is obviously possible too.

For the failure case I used an approach similar to the .net Task library. You can either stop a routine from a routine, yielding Break.It; or throwing an exception. All the exceptions, including the ones threw on purpose, will interrupt the execution of the current coroutine chain. Let’s see how to handle both cases with some, not so practical, examples.

In the example above we can see several new concepts. First of all, it shows how to use the Start() method providing what to execute when the ITaskRoutine is stopped or if an exception is thrown from inside the routine. It shows how to yield Break.It to emulate the Race function of the promises pattern. Break.It is not like returning yield break, it will actually break the whole coroutine from where the current enumerator has been generated. At last it shows how to yield an array of IEnumerator as syntactic sugar in place of the normal Parallel Task generation. Just to be precise, OnStop will NOT be called when the task routine completes, it will be called only when ITaskRoutine Stop() or yield Break.It are used.


Break.it will now break the current running task collection. This means that if you run Break.It inside a ParallelTaskCollection or SerialTaskCollection it will break the current collection only and not the whole ITaskRoutine. In this case Stop() won’t be called, but the TaskCollection completes. This is how to use Break.It in a real life scenario:


Now let’s talk about something quite interesting: the schedulers. So far we have seen our tasks running always on the standard scheduler, which is the Unity main thread scheduler. However you are able to define your own scheduler and you can run the task whenever you want! For example, you may want to run tasks during the LateUpdate or the PhysicUpdate. In this case you may implement your own IRunner scheduler or even inherit from MonoRunner and run the StartCoroutineInternal as a callback inside the Monobehaviour that will drive the LateUpdate or PhysicUpdate. Using a different scheduler than the default one is pretty straightforward:


Multithread and Continuation between threads

But what if I tell you that you can run tasks also on other threads? Yep that’s right, your scheduler can run on another thread as well and, in fact, one Multithreaded scheduler is already available. However you may wonder, what would be the practical way to use a multithreaded scheduler? Well, let’s spend some time on it, since what I came out with, is actually quite intriguing. Caution, we are now stepping in the twilight zone.

First of all, all the features so far mentioned work on whatever scheduler you put them on. This is fundamental in the design, however some limitations may be present due to the Unity not thread safe nature. For example, the MultiThreadRunner, won’t be able to detect special Unity coroutines, like WWW, AsyncOperation or YieldInstruction, which is obvious, since they cannot run on anything else than the main thread. You may wonder what the point of using a MultiThreadRunner is, if eventually it cannot be used with Unity functions. The answer is continuation! With continuation you can achieve pretty sweet effects.

Let’s see an example, enabling the PerformanceMT GameObject from the scene included in the library code. It compares the same code running on a normal StartCoroutine (SpawnObjects) and on another thread (SpawnObjectsMT). Enable only the MonoBehaviour you want test to compare the performance. What’s happening? Both MBs spawn 150 spheres that will move along random directions. In both cases, a slow coroutine runs. The coroutine goal is to compute the largest prime number smaller than a given random value between 0 and 1000; The result will be used to compute the current sphere color which will be updated as soon as it’s ready. The following is the multithreaded version:

Well I hope it’s clear to you at glance. First we run CalculateAndShowNumber on the multiThreadScheduler. We use the same MultiThreadRunner instance for all the game objects, because otherwise we would spawn a new thread for each sphere and we don’t want that. One extra thread is more than enough (I will spend few words on it in a bit).

FindPrimeNumber is supposed to be a slow function, which it is. As a matter of fact, if you run the single threaded version (enabling the SpawnObject monobehaviour instead of SpawnObjectMT) you will notice that the frame rate is fairly slow. In fact, the GPU must wait the CPU to compute the prime number.

The Multithreaded version runs the main enumerator on another thread, but how can the color be set since it’s impossible to use the Renderer component from anything else than the main thread? This is where a bit of magic happens. Returning the enumerator from a task, running on another scheduler, will actually continue its execution on that scheduler. You may think that at this point the thread will wait for the enumerator running on the main thread to continue. This is partially true, since differently than a Thread.Join(), the thread is actually not stalled, it will continue yielding, so if other tasks are running on the same thread, they will be actually processed. At the end of the main thread enumerator, the path will return to the other thread and continue from there. Quite fascinating, I’d say, also because you could expect great difference in performance.

So, We have seen some advanced applications of the TaskRunner using different threads, but since Unity will soon support c#6, you could wonder why to use the Svelto TaskRunner instead of the Task.Net library.  Well, they serve two different purposes. Task.Net library has not been designed for applications that could run heavy routines on threads. The Task.Net and the await/async keywords heavily exploit the Thread.Pool to use as many threads as possible, with the condition that most of the time, these threads are very short-lived. Basically it’s designed to serve applications that run hundreds of short lived and light asynchronous tasks.  This is usually true when we talk about server applications.

For games instead, what we generally need, are few threads where to run heavy operations that can go in parallel with the main thread and this is what the TaskRunner has been designed for. You will also be sure that all the routines running on the same MultiThreadRunner scheduler instance, won’t occur in any concurrency issue. In fact, you may see every Multithread runner as a Fiber (if you feel brave, you can also check this very interesting video). It’s also worth to notice that the MultiThreadRunner will keep the thread alive as long as it is actually used, effectively letting the Thread.Pool (used underneath) to manage the threads properly.

Other stuff…

To complete this article, I will spend few words on other two minor features. As previously mentioned, the TaskRunner will identify IAbstractTask implementations as well. Usually you will need to implement an ITask interface to be useful. The ITask is meant to transform whatever class in a task that can be executed through the task runner. For example, it could be used to run web services, which result will be yielded on the main thread.

The ITaskChain interface is something still at its early stage and it could be changed or deprecated in future. It could be useful to know parameters that are passed through tasks, like a sort of Chain Of Responsibility pattern.

The very last thing to take in consideration is the compatibility with Unity WWW, AsyncOperation and YieldInstruction objects. As long as you are not using parallel task, you can yield them from your enumerator and they will work as you expect! However you cannot use them from a parallel collection unless you wrap them in another IEnumerator, that’s why WWWEnumerator and AsyncOperationEnumerator classes exist.

The source code and examples are available from here: https://github.com/sebas77/Svelto.Task. Every feedback is more than welcome. Please let me know if you find any bug!

Notes on Optimizations

TaskRunner has been designed with optimizations in mind. Without counting the multithreaded runner, extensive use of TaskRunner will actually results in performance increase over the normal use of StartCoroutine and the normal Update functions. MonoRunner works using one single unity coroutine (Except in the few cases when they are handled to Unity) for all the performing tasks., this will eliminate all the overhead needed to run hundreds of separate Updates. TaskRunner is also very useful to run time-slicing tasks.

03/11/16 Notes

I realised that MonoRunner was behaving differently than the Unity StartCoroutine function since the latter runs the enumerator immediately, while MonoRunner was waiting for the next available slot. I changed its behaviour now, but this meant to introduce a not so elegant ThreadSafe version of every Run function.

08/01/17 Notes

Some minor changes and improvement of examples

  • Added a Editor Profiler to keep track of the tasks performance
  • WWW is not handed to Unity anymore, yield it always through WWWEnumerator
  • To avoid confusion, IEnumerable cannot be yielded directly anymore

22/10/17 Notes

A ton of features have been introduced and I have not been diligent enough to keep track of them properly. I will list the highlights:

  • Surely I have introduced some breaking changes, but they will be simple to fix
  • Massively improved the multi-threading related features, I know I need to write a good example on how to use them
  • The unit tests now run through the official Unity tests runner
  • improved examples and unit tests
  • Pausing/Resuming/Stopping/Starting ITaskroutines now works better and makes more sense
  • The profiler now can recognize tasks running on other threads and profile them (very cool)
  • Rewritten the MultiThreadedParallelTaskCollection and now is usable. A good example has been written as well and you can found it here.
  • yield break will stop the current Enumeration only, yield Break.It stops the current TaskCollection, yield.BreakAndStop stops the whole ITaskRoutine and triggers the stop callback
  • several optimizations
  • Important: TaskCollection logic has been rewritten to be able to reuse them after the enumeration ends (They can restart). In order to Clear them the new Clear function is added as Reset won’t clear them anymore
  • Added LateMonoRunner and UpdateMonoRunner (you can guess when they run 🙂 )
  • Added StaggeredMonoRunner. It allows to spread tasks over N frames
  • It’s now possible to create several MonoRunners. In this way you can manage and leave the “standard” ones alone when you need to do weird stuff
  • Breaking change: the TaskCollection doesn’t accept Enumerables anymore. This is because it was bad to hide the GetEnumerator() allocation that should have been done anyway.
  • The code after any WaitForEndOfFrameWaitForFixedUpdate and similar will be now executed when it’s expected

21/01/18 Notes

Check out the new articles:

How Svelto.ECS + Svelto.Task help writing Data Oriented, Cache Friendly, Multi-Threaded code


Svelto ECS is now production ready

Note: this article is now outdated as Svelto.ECS 2.0 is out. Read it to have a complete picture on the concepts, but the most up to date information can be found now here.

Note: in this article you will meet the term Node, which has been superseded by the term EntityView.

Six months passed since my last article about Inversion of Control and Entity Component System and meanwhile I had enough time to test and refine my theories, which have been proven to work well in a production environment. As result Svelto.ECS framework is now ready to use in a professional setup.

Let’s start from the basics, explaining why I use the word framework. My previous work, Svelto.IoC, cannot be defined as a framework; it’s instead a tool. As a tool, it helps to perform Dependency Injection and design your code following the Inversion of Control principle. Svelto.IoC is nothing else, therefore whatever you do with it, you can do without it in a very similar way, without changing much the design of your code. As such, svelto IoC doesn’t dictate a way of coding, doesn’t give a paradigm to follow, leaving to the coder the burden to decide how to design their game infrastructure.

Svelto.ECS won’t take the place of Svelto.IoC, however it can be surely used without it. My intention is to add some more features to my IoC container in future and also write another article about how to use it properly, but as of now, I would not suggest to use it, unless you are aware of what Inversion of Control actually means and that you are actually looking for an IoC container that well integrates with your existing design (like, for example, a MVP based GUI framework). Without understanding the principles behind IoC, using an IoC container can result in very dangerous practices. In fact I have seen smell code and bad design implementations that actually have been assisted by the use of an IoC container. These ugly results would have been much harder to achieve without it.

Svelto.ECS is a framework and with it you are forced to design your code following its specifications. It’s the framework to dictate the way you have to write your game infrastructure. The framework is also relatively rigid, forcing the coder to follow a well defined pattern. Using this kind of framework is important for a medium-large team, because coders have different levels of experience and it is silly to expect from everyone to be able to design well structured code. Using a framework, which implicitly solves the problem to design the code, will let the team to focus on the implementation of algorithms, knowing, with a given degree of certainty, that the amount of refactoring needed in future will be less than it would have been if left on their own.

The first error I made with the pre-release implementation of this framework, was to not define clearly the concept of Entity. I noticed that it is very important to know what the elements of the project are before to write the relative code. Entities, Components, Engines and Nodes must be properly identified in order to write code faster. An Entity is actually a fundamental and well defined element in the game. The Engines must be designed to manage Entities and not Components. Characters, enemies, bullets, weapons, obstacles, bonuses, GUI elements, they are all Entities. It’s easier, and surely common, to identify an Entity with a view, but a view is not necessary for an Entity. As long as the entity is well defined and makes sense for the logic of the game, it can be managed by engines regardless its representation.

In order to give the right importance to the concept of Entity, Nodes cannot be injected anymore into Engines until an Entity is explicitly built. An Entity is now built through the new BuildEntity function and the same function is used to give an ID to the Entity. The ID doesn’t need to be globally unique, this is a decision left to the coder. For example as long as the ID of all the bullets are unique, it doesn’t matter if the same IDs are used for other entities managed by other engines.

The Setup

Following the example, we start from analyzing the MainContext class. The concept of Composition Root, inherited from the Svelto IoC container, is still of fundamental importance for this framework to work. It’s needed to give back to the coder the responsibility of creating objects (which is currently the greatest problem of the original Unity framework, as explained in my other articles) and that’s what the Composition Root is there for. By the way, before to proceed, just a note about the terminology: while I call it Svelto Entity Component System framework, I don’t use the term System, I prefer the term Engine, but they are the same thing.

In the composition root we can create our ECS framework main Engines, Entities, Factories and Observers instances needed by the entire context and inject them around (a project can have multiple contexts and this is an important concept).

In our example, it’s clear that the amount of objects needed to run the demo is actually quite limited:

The EngineRoot is the main Svelto ECS class and it’s used to manage the Engines. If it’s seen as IEntityFactory, it will be used to build Entities. IEntityFactory can be injected inside Engines or other Factories in order to create Entities dynamically in run-time.

It’s easy to identify the Engines that manage the Player Entity, the Player Gun Entity and the Enemies Entities. The HealthEngine is an example of generic Engine logic shared between specialized Entities. Both Enemies and Player have Health, so their health can be managed by the same engine. DamageSoundEngine is also a generic Engine and deals with the sounds due to damage inflicted or death. HudEngine and ScoreEngine manage instead the on screen FX and GUI elements.

I guess so far the code is quite clear and invites you to know more about it. Indeed, while I explained what an Entity is, I haven’t given a clear definition of Engine, Node and Component yet.

An Entity is defined through its Components interfaces and the Implementors of those components. A Svelto ECS Component defines a shareable contextualized collection of data. A Component can exclusively hold data (through properties) and nothing else. How the data is contextualized (and then grouped in components) depends by how the entity can been seen externally. While it’s important to define conceptually an Entity, an Entity does not exist as object in the framework and it’s seen externally only through its components. More generic the components are, more reusable they will be. If two Entities need health, you can create one IHealthComponent, which will be implemented by two different Implementers. Other examples of generic Components are IPositionComponent, ISpeedComponent, IPhysicAttributeComponent and so on. Component interfaces can also help to reinterpret or to access the same data in different ways. For example a Component could only define a getter, while another Component only a setter, for the same data on the same Implementor. All this helps to keep your code more focused and encapsulated. Once the shareable data options are exhausted, you can start to declare more entity specific components. Bigger the project becomes, more shareable components will be found and after a while this process will start to become quite intuitive. Don’t be afraid to create components even just for one property only as long as they make sense and help to share components between entities. More shareable components there are, less of them you will need to create in future. However it’s wise to make this process iterative, so if initially you don’t know how to create components, start with specialized ones and the refactor them later.
The data hold by the components can be used by engine through polling and pushing. This something I have never seen before in other ECS framework and it’s a very powerful feature, I will explain why later, but for the time being, just know that Components can also dispatch events through the DispatcherOnChange and DispatcherOnSet classes (you must see them as a sort of Data Binding feature).

The Implementer concept was existing already in the original framework, but I didn’t formalize it at that time. One main difference between Svelto ECS and other ECS frameworks is that Components are not mapped 1:1 to c# objects. Components are always known through Interfaces. This is quite a powerful concept, because allows Components to be defined by objects that I call Implementers. This doesn’t just allow to create less objects, but most importantly, helps to share data between Components. Several Components can been actually defined by the same Implementer. Implementers can be Monobehaviour, as Entities can be related to GameObjects, but this link is not necessary. However, in our example, you will always find that Implementers are Monobehaviour. This fact actually solves elegantly another important problem, the communication between Unity framework and Svelto ECS.
Most of the Unity framework feedback comes from predefined Monobehaviour functions like OnTriggerEnter and OnTriggerExit. One of the benefit to let Components dispatch events, is that the communication between the Implementer and the Engines become seamless. The Implementer as Monobehaviour dispatches an IComponent event when OnTriggerEnter is called. The Engines will then be notified and act on it.

It’s fundamental to keep in mind that Implementers MUST NOT DEFINE ANY LOGIC. They must ONLY implement component interfaces, hold the relative states and act as a bridge between Unity and the ECS framework in case they happen to be also Monobehaviours.

So far we understood that: Components are a collection of Data, Implementers define the Components and don’t have any logic. Hence where all the game logic will be? All the game logic will be written in Engines and Engines only. That’s what they are there for. As a coder, you won’t need to create any other class. You may need to create new Abstract Data Types to be used through components, but you won’t need to use any other pattern to handle logic. In fact, while I still prefer MVP patterns to handle GUIs, without a proper MVP framework, it won’t make much difference to use Engines instead. Engines are basically presenters and Nodes hold views and models.

OK, before to proceed further with the theory, let’s go back to the code. We have easily defined Engines, it’s now time to see how to build Entities. Entities are injected into the framework through Entity Descriptors. EntityDescriptor describes entities through their Implementers and present them to the framework through Nodes. For example, every single Enemy Entity, is defined by the following EnemyEntityDescriptor:

This means that an Enemy Entity and its components will be introduced to the framework through the nodes EnemyNode, PlayerTargetNode and HealthNode and the method BuildEntity used in this way:

While EnemyImplementor implements the EnemyComponents, the EnemyEntityDescriptor introduces them to the Engines through the relative Enemy Nodes. Wow, it seems that it’s getting complicated, but it’s really not. Once you wrap your head around these concepts, you will see how easy is using them. You understood that Entities exist conceptually, but they are defined only through components which are implemented though Implementers. However you may wonder why nodes are needed as well. While Components are directly linked to Entities, Nodes are directly linked to Engines. Nodes are a way to map Components inside Engines. In this way the design of the Components is decoupled from the Engines. If we had to use Components only, without Nodes, often it could get awkward to design them.
Should you design Components to be shared between Entities or design Components to be used in specific Engines? Nodes rearrange group of components to be easily used inside Engines. Nodes are mapped directly to Engines. specialized Nodes are used inside specialized Engines, generic Nodes are used inside generic Engines. For example: an Enemy Entity is an EnemyNode for Enemy Engines, but it’s also a PlayerTargetNode for the PlayerShootingEngine. The Entity can also be damaged and the HealthNode will be used to let the HealthEngine handle this logic. Being the HealthEngine a generic engine, it doesn’t need to know that the Entity is actually an Enemy. different and totally unrelated engines can see the same entity in totally different ways thanks to the nodes. The Enemy Entity will be injected to all the Engines that handle the EnemyNode, the PlayerTargetNode and the HealthNode Nodes. It’s actually quite important to find Node names that match significantly the name of the relative Engines.

As we have seen, entities can easily be created explicitly in code passing a specific EntityDescriptor to the BuildEntity function. However, some times, it’s more convenient to exploit the polymorphism-like feature given by the IEntityDescriptorHolder interface. Using it, you can attach the information about which EntityDescriptor to use directly on the prefab, so that the EntityDescriptor to use will be directly taken from the prefab, instead to be explicitly specified in code. This can save a ton of code, when you need to build Entities from prefabs.

So we can define multiple IEntityDescriptorHolder implementations like:

and use them to create Entities implicitly, without needing to specify the descriptor to use, since it can be fetched directly from the prefab (i.e.: playerPrefab, enemyPrefab):

Once Engines and Entities are built, the setup of the game is complete and the game is ready to run. The coder can now focus simply on coding the Engines, knowing that the Entities will be injected automatically through the defined nodes. If new engines are created, is possible that new Nodes must be added, which on their turn will be added in the necessary Entity Descriptors.

Note though: with experience I realized that it’s better to use Monobehaviour and Gameobject to build entity ONLY when strictly needed. This means only when you actually need engines to communicate with the Unity engine through the implementors. Avoid this otherwise, for example, do not use Monobehaviours as implementers because your monobehaviours act as data source, just read the data in a different way. Keep your code decoupled from unity as much as you can.

The Logic in the Engines

the necessity to write most of the original boilerplate code has been successfully removed from this version of the Framework. An Important feature that has been added is the concept of IQueryableNodeEngine. The engines that implement this interface, will find injected the property

This new object remove the need to create custom data structures to hold the nodes. Now nodes can be efficiently queried like in this example:

This example also shows how Engines can periodically poll or iterate data from Entities. However often is very useful to manage logic through events. Other ECS frameworks use observers or very awkward “EventBus” objects to solve communication between systems. Event bus is an anti-pattern and the extensive use of observers could lead to messy and redundant code. I actually use observers only to setup communication between Svelto.ECS and legacy frameworks still present in the project. DispatchOnSet and DispatchOnChange are an excellent solution to this awkward problem as they allow Engines to communicate through components, although the event is still seen as data. Alternatively, In order to setup a communication between engines, the new concept of Sequencer has been introduced (I will explain about it later in the notes).


IRemoveEntityComponent needs a dedicated paragraph. This Component is a special Component partially managed by the framework. The user must still need to implement it through an implementer, but all is needed to implement is:

the removeEntity action is automatically set by the framework and once called, it will remove, from all the engines, all the nodes related to the entity created by the BuildEntity function. This is the only way to execute the Remove engine method. A standard example is to use in a Monobehaviour implementer like this:

but can be used also in this way:

the removeEntity delegate will be injected automatically by the framework, so you will be find it ready to use.


I understand that some time is needed to absorb all of this and I understand that it’s a radically different way to think of entities than what Unity taught to us so far. However there are numerous relevant benefits in using this framework beside the ones listed so far. Engines achieve a very important goal, perfect encapsulation. The encapsulation is not even possibly broken by events, because the event communication happens through injected components, not through engine events. Engines must never have public members and must never been injected anywhere, they don’t need to! This allows a very modular and elegant design. Engines can be plugged in and out, refactoring can happen without affecting other code, since the coupling between objects is minimal.

Engines follow the Single Responsibility Principle. Either they are generic or specialized, they must focus on one responsibility only. The smaller they are, the better it is. Don’t be afraid of create Engines, even if they must handle one node only. Engine is the only way to write logic, so they must be used for everything. Engines can contain states, but they never ever can be entity related states, only engine related states.

Once the concepts explained above are well understood, the problem about designing code will become secondary. The coder can focus on implementing the code needed to run the game, more than caring about creating a maintainable infrastructure. The framework will force to create maintainable and modular code. Concept like “Managers”, “Containers”, Singletons will just be a memory of the past.

Some rules to remember:

  • Nodes must be defined by the Engine themselves, this means that every engine comes with a set of node(s). Reusing nodes often smells
  • Engines must never use nodes defined by other engines, unless they are defined in the same specialized namespace. Therefore an EnemyEngine will never use a HudNode.
  • Generic Engines define Generic Nodes, Generic Engines NEVER use specialized nodes. Using specialized nodes inside generic engines is as wrong as down-casting pointers in c++

Please feel free to experiment with it and give me some feedback.

Notes from 21/01/2017 Update:

Svelto ECS has been successfully used in Freejam for some while now on multiple projects. Hence, I had the possibility to analyse some weakness that were leading to some bad practices and fix them. Let’s talk about them:

  1. First of all, my apologies for the mass renames you will be forced to undergo, but I renamed several classes and namespaces.
  2. The Dispatcher class was being severely abused, therefore I decided to take a step back and deprecate it. This class is not part of the framework anymore, however it has been completely replaced by DispatcherOnSet and DispatcherOnChange. The rationale behind it is quite simple, Components must hold ONLY data. The difference is now how this data is retrieved. It can be retrieved through polling (using the get property) or trough pushing (using events). The event based functionality is still there, but it must be approached differently. Events can’t be triggered for the exclusive sake to communicate with another Engine, but always as consequence of data set or changed. In this way the abuse has been noticeably reduced, forcing the user to think about what the events are meant to be about. The class now exposes a get property for the value as well.
  3. The new Sequencer class has been introduced to take care of the cases where pushing data doesn’t fit well for communication between Engines. It’s also needed to easily create sequence of steps between Engines. You will notice that as consequence of a specific event, a chain of engines calls can be triggered. This chain is not simple to follow through the use of simple events. It’s also not simple to branch it or create loops. This is what the Sequencer is for: