JoeJ said:
However, only now i notice brute force became faster than any grids with Clang. With a whopping 5000 objects. I will need days to swallow this down and deal with it… :O
Wuut? Are you sure that you didn't introduce some suble error in the algorithm for eigther case? I mean, it does find the same number of elements, but even I would not have expected this to happen for that many objects.
Alternatively, maybe its because with that many elements, the cells all become “satured”, and you end up doing O(N^2) for each for a number of elements that is too high comparatively? I'm not really sure how the cells are divided exactly, but root of 5000 is 70, is of we assume 8*8 cells, that would mean roughly 8-9 elements on each axis per cell, so 81, which would mean that even per cell you are doing about as many brute-force steps per cell than my game is doing in high-load scenarios ? Assuming that intra-cell comparisons are O(N^2), which I assume. So following that logic, brute force would be 12.500.000 “steps”, while 64-grid cells * (81*81) collisions would mean 420.000 by my napkin math, which is a difference of x25 “steps” for brute-force, which would still fall within the range that one could except the raw speed to outweight the complexity-difference.
On my own front, I did implement the collision-filtering, with somehow good results:
Running my test-sequence with the bossfight with the most projectiles, in debug execution-speed goes up from 35x to 72x. Nice! I don't have a release-variant for this single test, but for the whole project-tests, it went from 87s to 83s, which is a good improvement considering the length of game that is being run in that time. Sorry for the unusual performance-metrics, I could measure the actual time being spent inside the actual loop, but I do care about how it globally affects those test-speeds the most, this has a practical advantage for me ? Considering that those numbers do include everything that is being run with the game, those numbers do mean a lot actually.
EDIT: See next post for some corrections
The code is here: https://pastebin.com/gJRJ3M0V
There is a lot of engine-specific yadayda going on, but I hope the core of the algorithm is understandable. Notable things are:
- Iterating over the actual containers I have still done in O(N^2), which made ma chuckle a bit. But I am fairly certain that for an N of 5 as of now, it is still better, and easier to type than creating explicit list-of target types for this quick example ?
- There is some additional complexity in having to order collisions, which already existed before, which not every system might need. So this could be used to speed up things even more
- There a few things which potentially could be moved around to eigther improve, or make worse, the characteristics. Like certain properties could additionally be cached when moving the elements into their bins, or less, depending on the use-case. This does depend a bit on how often things actually collide, which brings me to my next point:
- This implementation is only for “touch”-triggers, meaning no physically blocking collision. Which also means that the actual % of overlaps here is very few each frame. The actual collision-system, at least for this new format, could also be updated somewhat easily, as I have certain guarantees (like than an entity will not physically be deleted until the end of frame). However, the system in its entirely is a bit more complex, so going beyond that with a grid there wouldn't be as easy. In short, the blocking-collisions are not handled just in one loop. For the specific movement-system I have, it is necessary to dynamically update potential locations of objects while they move, as well as potential collision-pairs, in a sense that cannot simply be resolved by pushing collisions out of their contact points.
I'll continue implementing this for blocking-collisions as well, and see what the final numbers say ?