Jumping back to JoeJ's post, I think it's a bit unfortunate that it used particles as an example, as it skips over the primary source of the evil cache misses in most CES (component entity systems like Unity). As a very simple example of the difference, your particles might be stored as “vector<Particle>” while a game entity is likely to be one indirection away, “vector<EntityPtr>”. I.e. a vector “of” particles versus a vector of “pointers” to entities. This is a core issue in many game engines and is what keeps them from handling high entity counts. So why wouldn't the entities be stored directly in the vector? There are several reasons. Just a few I can name quickly: stable pointers between entities; entities are organized in hierarchies and the vector is just the “root” of a tree; the entity memory usually comes from a pool to avoid new/delete and memory fragmentation; and lots of others.
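To make the difference concrete, here is a minimal sketch of the two layouts. The Particle and Entity structs are stripped-down placeholders, and EntityPtr is assumed to be a std::unique_ptr purely for illustration; a real engine would more likely hand out pool-allocated handles.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Placeholder particle: stored by value, so the vector is one contiguous block.
struct Particle { float x; float life; };

// Placeholder entity; real ones carry far more state.
struct Entity { float x; };
// Assumption: EntityPtr is modeled as unique_ptr only for this sketch.
using EntityPtr = std::unique_ptr<Entity>;

// Contiguous walk: element i+1 sits right after element i in memory.
float sum_x(const std::vector<Particle>& particles) {
    float total = 0.f;
    for (const Particle& p : particles) total += p.x;
    return total;
}

// One indirection per element: the vector holds pointers, and each entity
// lives wherever the allocator happened to put it.
float sum_x(const std::vector<EntityPtr>& entities) {
    float total = 0.f;
    for (const EntityPtr& e : entities) {
        assert(e != nullptr);
        total += e->x;  // this dereference is the likely cache miss
    }
    return total;
}
```

Both loops do the same arithmetic; only the second has to chase a pointer for every element.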
Anyway, iterating through the list of entities in a vector of pointers to entities is the prime cause of cache issues in a game engine trying to push large entity counts. Making things worse is that in the CES style each entity will have a list of components it owns. So, your update loop is something like the following:
void update_entities(vector<EntityPtr>& entities) {
    for (auto& entity : entities) {
        // recurse down
        update_entities(entity->children());
        // update each component
        for (auto& component : entity->components()) {
            component->update();
        }
    }
}
This is where a lot of folks would start. Hopefully they will optimize it, but the point here is that the memory access is random at many levels and thus problematic. As has been pointed out, CPUs are fast, really really fast; the ongoing problem is that the gap between CPU performance and memory performance keeps growing. At this point in time, memory access coherency is almost surely the #1 limiting factor in engines that simulate a lot of entities. Add more cores and things get worse quickly due to cache line invalidation across cores/chiplets/whatever.
Anyway, thinking in terms of AOS versus SOA at this point isn't going to help much; you have work to do before that comes into play. The first thing is to remove at least some of the random access from this update system. A common starting point is to “not” update the components within the above loop. The loop would simply run through the entity containers deleting any dead entities, inserting newly spawned entities, handling enable/disable flag changes etc. After that completes, the engine updates components per type in tight loops. Because the components are now updated per type, they can be stored in linear memory and iterated without the big cache misses. This puts some limitations on what can be done in components and when data is available, but it's nice and consistent so folks quickly get used to it.
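As a rough sketch of that second phase, assuming two made-up component types (Mover and Health) and a hypothetical World struct that owns one flat array per type, the per-type update could look something like this. The structural pass over the entity containers is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical component types, stored by value in one flat array per type.
struct Mover  { float x; float vx; };
struct Health { float hp; float regen; };

// Hypothetical owner of the flat component arrays.
struct World {
    std::vector<Mover>  movers;
    std::vector<Health> healths;
};

// Tight loop over one component type: linear memory, no pointer chasing.
void update_movers(std::vector<Mover>& movers, float dt) {
    for (Mover& m : movers) m.x += m.vx * dt;
}

// Same idea for a second type; hp is clamped to an assumed maximum of 100.
void update_healths(std::vector<Health>& healths, float dt) {
    for (Health& h : healths) h.hp = std::min(100.f, h.hp + h.regen * dt);
}

void update_world(World& world, float dt) {
    // Phase 1 (not shown): walk the entity containers to delete dead
    // entities, insert spawned ones, and apply enable/disable changes.
    // Phase 2: update components one type at a time.
    update_movers(world.movers, dt);
    update_healths(world.healths, dt);
}
```

The fixed order of the per-type calls is where the "when is data available" limitation comes from: in this sketch a Health update can rely on this frame's Mover results, but not the other way around.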
If you keep reducing random access like this by flattening the update, you will eventually end up with something similar to an ECS. I literally made the step by step transition myself, removing more random access here and there until the resulting architecture was basically an ECS. Completing the transition would be converting the structures from AOS to SOA organization. To be clear, most games that use a CES style setup like the above would probably be fine without doing anything more than moving the components out to flat arrays. Most work is done in components, and having them optimized is probably going to be enough for almost any type of game that is not trying to run thousands of entities. You can even multi-thread this to a degree, either with shadow writes or by controlling which component updates run in parallel based on what data they read/write. Going to a full ECS thus may not be of any great benefit for most people.
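For reference, the AOS to SOA step might look roughly like the following. MoverAoS and MoversSoA are invented names for this sketch, and the fields are just an example of a component that would be split this way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AOS: one struct per element, all fields interleaved in memory.
struct MoverAoS { float x, y, vx, vy; };

// SOA: one array per field; element i is spread across the four arrays.
struct MoversSoA {
    std::vector<float> x, y, vx, vy;

    std::size_t size() const { return x.size(); }

    void push_back(const MoverAoS& m) {
        x.push_back(m.x);   y.push_back(m.y);
        vx.push_back(m.vx); vy.push_back(m.vy);
    }
};

// Integrate positions. Each field is read or written as a linear stream,
// so a loop that only needed x and vx would never pull y or vy into cache.
void integrate(MoversSoA& movers, float dt) {
    const std::size_t n = movers.size();
    for (std::size_t i = 0; i < n; ++i) {
        movers.x[i] += movers.vx[i] * dt;
        movers.y[i] += movers.vy[i] * dt;
    }
}
```

Whether this beats the flat AOS array depends on how many of the fields each update actually touches, which is part of why the step is optional for a lot of games.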
The place where fixing the update loop breaks down and the transition to SOA becomes more important is when you start adding AI to the game. Most entity AI requires the ability to reason about its surroundings, i.e. looking at other entities. The simple case is for an enemy to look at the entities around it and figure out which one is closest. It needs to look at the positions of all the surrounding entities. Unfortunately, spatial coherence and memory coherence rarely have anything to do with each other, and as such entity reasoning becomes the next big challenge for memory problems. Moving the entity data to SOA is not a great help for this, but it does at least increase the chances of several entities of interest being in the same cache lines. There are other solutions that build on this, but at this point it doesn't really matter if you are using unoptimized CES or full blown ECS; it's a completely different problem and the organizing architecture is of little help here.
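A brute-force version of the "which one is closest" query over SOA positions could be sketched like this. The Positions struct and the nearest function are hypothetical names, and the scan covers every entity rather than only nearby ones, so it illustrates the memory layout and is not a real spatial query.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical SOA position storage: only the data the query needs.
struct Positions {
    std::vector<float> x, y;
};

// Returns the index of the entity closest to `self`, or `self` itself
// when there is no other entity. Compares squared distances, no sqrt needed.
std::size_t nearest(const Positions& p, std::size_t self) {
    assert(p.x.size() == p.y.size());
    assert(self < p.x.size());
    std::size_t best = self;
    float best_d2 = std::numeric_limits<float>::max();
    for (std::size_t i = 0; i < p.x.size(); ++i) {
        if (i == self) continue;
        const float dx = p.x[i] - p.x[self];
        const float dy = p.y[i] - p.y[self];
        const float d2 = dx * dx + dy * dy;
        if (d2 < best_d2) {
            best_d2 = d2;
            best = i;
        }
    }
    return best;
}
```

The scan reads nothing but packed floats, so many candidate positions share each cache line, but the indices that end up spatially close can still be anywhere in the arrays.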
Hopefully the pedantic breakdown starting at a higher level is helpful.