Advertisement

How much is too much

Started by December 18, 2023 05:29 PM
19 comments, last by LorenzoGatti 11 months, 3 weeks ago

JoeJ was saying in a previous thread that when you have a RTS with more than 100 units you need to start thinking about optimizations when it comes collision detection. I’m making a calculus 100 x 100 units is 10.000 collision detection checks. You can cut that in half with well thought looping ( if you avoid to collide same units twice). Colliding two units means 12 comparisons between 2 ints. 5000 collisions means 60000 comparisons. So that is still acceptable. If you use a collision optimization tree each unit collides with up to 8 more units so that’s 800 collision detection checks.

I’m wondering if using Dijkstra is not too much for a 40 x 40 nodes map. Sorting 1600 nodes to pick a node to visit means 160000 comparisons for a web of 100 nodes visited.( that’s almost three times more than colliding 100 units) Am I way off with my numbers? I’m just trying to understand what is reasonable.

My project`s facebook page is “DreamLand Page”

Calin said:
using Dijkstra

Don't do that. Dijkstra's algorithm visits the entire navigation graph, which becomes very slow for large graphs. Use A-star instead, which will only visit the parts of the graph that are relevant to the current path being planned.

Advertisement

You cannot have a single number as “this is too much”. To a large extent it all depends on how much time you have to do the job.

If you do a play-by-mail game, then CPU time isn't critical at all, and pretty much anything is acceptable. With an RTS, likely you want several things done each frame and your time is more limited. However even within the RTS genre, it depends. If you want smart bad guys, you need relatively much CPU time to compute their next move, and thus you have less time for other parts. If you have no AI at all (just PvP), you have more time since you don't need to compute an AI at all.

Calin said:
JoeJ was saying in a previous thread that when you have a RTS with more than 100 units you need to start thinking about optimizations when it comes collision detection. I’m making a calculus 100 x 100 units is 10.000 collision detection checks. You can cut that in half with well thought looping ( if you avoid to collide same units twice). Colliding two units means 12 comparisons between 2 ints. 5000 collisions means 60000 comparisons. So that is still acceptable. If you use a collision optimization tree each unit collides with up to 8 more units so that’s 800 collision detection checks.

You count collision checks, but then you multiply this by a constant of n comparisons. Notice those comparisons are irrelevant, because they are just a constant factor applied to both options equally, the brute force approach, and the alternative of using some acceleration structure.

More importantly, to see what actually matters, we need to observe how costs of our algorithms change, as N changes.
In your example you pick some number which you assume to be practical, but you should think of a table like this:

N		brute force	 acceleration structure
10		100			10*8
100		10000		100*8
1000	1000000		1000*8

This is the number of collision checks, roughly as you have listed them, but for 3 examples of N (number of units).

Likely you knew it already, but it shows how quickly the brute force cost grows, actually O(N^2).

Contrary, the cost with the AS increases linearly, just like N itself, so O(N).

Notice we ignore some things here: We can half the number for BF; building and looking up / traversing has some cost too; nobody knows where you have pulled out that number of 8 out from.
But nothing of this matters, because it's ‘constant costs’. Similar to your redundant information of integer comparisons, which also is a constant cost.
To get the time complexity of an algorithm, we go a step further: I will now remove the 8 from the table, because it's a constant factor to the cost of the entire column / algorithm. And i have already skipped the division by 2 on the brute force column anyway.

Now you may ask: Is this fair and correct? How can we compare two algorithms costs if we remove factors affecting those costs?

The answer is: We do not compare the algorithms to see which is faster for a given number of N. It looks like that, but that's not what we do. If we want that, we would do something different, more on this soon.
No, we only want to see how the costs of any algorithm change as the size of the problem changes. Because this is all we can generally say about the costs of some algorithm, besides memory requirements.

To find the real costs you actually ask for, you simply try it out.
Run those examples, using your code and your hardware, and measure the time this and that takes.
You need a simple way to measure time accurately. I use this outdated and ugly code. (Modern C++ has std::chrono now, so this should be easier now, but actually isn't. At least it's cross platform.)

#include <mutex>
#include <algorithm>
#include <stdio.h>
#include <windows.h>
#include <direct.h>
#include <Mmsystem.h>

#include "SystemTools.h"


uint64_t SystemTools::GetTimeNS ()
{
	static LARGE_INTEGER ticksPerSec;
	static bool initialized = false;

	if (!initialized) 
	{
		QueryPerformanceFrequency(&ticksPerSec); 
		initialized = true;
	}
	LARGE_INTEGER now;
	QueryPerformanceCounter(&now);
	return (uint64_t) ((1e9 * now.QuadPart) / ticksPerSec.QuadPart);
}

std::string SystemTools::GetDurationStringAndRestart(uint64_t& time)
{
	uint64_t time2 = GetTimeNS ();
	uint64_t diff = time2 - time;
	time = time2;
	int ns		= int(diff / 1000);
	int ms		= int(diff / 1000000);
	int seconds = int(diff / 1000000000);
	int minutes = seconds / 60;
	std::string str = "";
	if (minutes >= 1) str += std::to_string(minutes) + " min. ";
	seconds %= 60;
	if (seconds >= 1) str += std::to_string(seconds % 60) + " sec. ";
	if (seconds < 10) str += std::to_string(ms % 1000) + " ms. ";
	if (seconds < 1) str += std::to_string(ns % 1000) + " ns. ";
	return str;
}

Then i do something like this to measure runtime:

uint64_t time = SystemTools::GetTimeNS();
DoCollisionTestWithLotsOfUnits();
ImGui::Text("Collision test done after %s\n", SystemTools::GetDurationStringAndRestart(time).c_str());


That's what you need to do. Measure the time to see if using a simple grid for acceleration structure is faster than brute force.
It's the only way to figure out that magic number of ‘100 units’ this JoeJ guy has claimed, although it can be actually anything.
(The number of 100 was a rough guess taken from the 2D collisions demo i did here somewhere, but i could not really measure it. Runtime is too short and measurement is too noisy. For realtime workloads you need to make an average of many measurements, which i did not do.)

Calin said:
I’m wondering if using Dijkstra is not too much for a 40 x 40 nodes map. Sorting 1600 nodes to pick a node to visit means 160000 comparisons for a web of 100 nodes visited.( that’s almost three times more than colliding 100 units) Am I way off with my numbers? I’m just trying to understand what is reasonable.

I'm confused. But i would say Dijkstra visits each node only once at most, so the cost for a 40x40 map is (≥ 1600) times something you have to measure as well.

Regarding A-star, yes it's faster for ‘one source, one target’ scenarios. Much faster if obstacles are sparse, not faster if we are in some labyrinth.
For ‘one source, many targets’ Dijsktra is still efficient. But idk if that's relevant for games.

JoeJ said:
For ‘one source, many targets’ Dijsktra is still efficient. But idk if that's relevant for games.

What is one source, many targets?

If you select a group of units (in sense of AoE or such game), it really is many source, many target. Creating GOOD path is somewhat harder than just creating a path, because you have to avoid inter-unit collisions while moving the whole group over the map.

I'll avoid some implementation today (been way too busy lately) and point out this link which shows some animations of problems and their possible solutions - https://www.gamedeveloper.com/programming/group-pathfinding-movement-in-rts-style-games

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

JoeJ said:
For ‘one source, many targets’ Dijsktra is still efficient.

I am guessing this means one NPC going for the nearest target (eg a worker finding the nearest tree to cut).

You can do that with A* too with a trick.

To understand, consider the normal A* scenario: one source one target first. You repeatedly expand the path by picking the most promising solution that you have, and replace it by new paths to the neighbors of the picked solution. As such, you pretty much always have multiple paths “in progress” while running the algorithm, This makes the extension to find the optimal starting point from several possible starting points to one target trivial. Just insert all sources as initial solution instead of just one source.

One source many targets is the same as many sources and one target, if you explore the path in reverse. You start at all target points and find a path walking “backwards” to the single source. If path walking is symmetrical (ie both directions between two neighboring points have equal cost) this is quite trivial. If costs are not symmetrical, they must be swapped since the algorithm runs backwards.

Advertisement

Alberth said:
To understand, consider the normal A* scenario: one source one target first. You repeatedly expand the path by picking the most promising solution that you have, and replace it by new paths to the neighbors of the picked solution. As such, you pretty much always have multiple paths “in progress” while running the algorithm, This makes the extension to find the optimal starting point from several possible starting points to one target trivial. Just insert all sources as initial solution instead of just one source.

I don't understand, but what i meant is much simpler i guess.

Say you pick one random source vertex on the mesh, and 10 random target vertices.
You start at the source, and keep Dijkstra running until all 10 targets have been found. No changes or extensions to the algorithm are needed, and all 10 paths will be the shortest possible as expected.

I guess with A star this can't work, because the heuristic involves a target.
Dijkstra dos not care about that, it only expands a tree of shortest paths from the source, visiting every vertex of the graph anyway, so we already pay the cost to find all shortest paths even if we need only one. We may cut that cost by terminating early when reaching our target(s), but if we keep running we find all shortest paths from the source.

For games this would be useful if you want to find the closest target out of many options.
But if that's 4 nearby targets for example, and each can be found by a straight path, running A star 4 times is likely still faster.

Personally i have used this in geometry processing. There are simple algorithms where a combination of spanning tree and shortest paths can cut a mesh into a set of patches, each topologically equivalent to a disk. This is how you generate UV charts automatically.
We can also improve the quality of the cuts by setting edge cost from curvature. This way we get nice chart boundaries. If i run it on a tessellated cube, the cuts will be on the edges, and you get the typical cross (or similar) shape of 6 connected quadratic regions, like a human artist would do.
It's really interesting how graph algorithms can solve such difficult problems.

Sometimes you need to order the paths exiting a vertex in clockwise order. Then it is not allowed that paths cross each other. The guarantee that shortest paths can't cross is essential here, and it's also a situation where many targets at once is useful.

JoeJ said:
nobody knows where you have pulled out that number of 8 out from.

Oh, i think i got it: Your game is quantized to the grid, and every grid cell can contain only one unit. so the 3x3 neighborhood gives 8 potential other units nearby.

But then you don't need collision detection or resolution at all, because units are not allowed to collide. : )

Edit: This means you already use a regular grid acceleration structure. You can find nearby units in constant time, so the cost does not increase faster than N, and you can scale up to more units without worries about exploding costs.
No reason for performance worries, but you should use A star instead Dijkstra almost certainly.

And for clarity, A* is an optimization on Dijkstra's single target algorithm, with the addition of an optimization heuristic. The addition uses the map coordinate to prioritize the shortest distance map coordinate first. If you set the heuristic value to a constant (like 1) then it removes the optimization and reverts to the root shortest path algorithm.

The heuristic usually reduces the search space tremendously, but depending on heuristic implementation, can miss the optimal value due to a local minima. He taught about it and others in his courses, an optimization he didn't notice up front.

@Calin It's insightful to consider optimization strategies for RTS games with over 100 units. Your calculation demonstrates the impact of collision detection, highlighting the importance of efficient looping. Utilizing a collision optimization tree significantly reduces checks, enhancing performance and maintaining playability with 10,000 units. Great strategic thinking!


JoeJ: every grid cell can contain only one unit

yes, that’s the approach I’m using

JoeJ: But then you don’t need collision detection

I was about to ask about that.

The problem of using a timer is that you get different results on different computers. All I’m saying is that it’s good to stay alert when you’re repeating an operation tens of thousands of times at one time.

frob: And for clarity, A* is an optimisation

Thanks, I find interesting what you and Alberth are talking about. Hopefully I will write my own implementation of A star at some point. For now I will keep using Dijkstra, it’s bug free hence I can focus on other areas.

My project`s facebook page is “DreamLand Page”

This topic is closed to new replies.

Advertisement