Advertisement

Out of ideas for optimizing A* search, got any?

Started by September 08, 2017 09:00 PM
23 comments, last by ferrous 7 years, 2 months ago

Here are the two main files:

Path: https://pastebin.com/ZkrbJ78t

AStar: https://pastebin.com/EVGazFtU

 

A few things I've done to optimize the search:

1. Rather than a dictionary to keep track of scores and parents, I use a 2d array of objects that are mapped to the grid i.e. [x,y] represents grid point (x,y).  Objects have F,G,H and Parent fields, these fields are reset to default values during each search.

2. Cheap weighted heuristic

3. Orthogonal neighbors (searching 4 neighbors rather than 8)

Paths don't have to be beautiful because the point reduction smooths them out quite nice. 

In Path.cs you can see the two functions to reduce points, Incremental and Stretch.  Stretch is much faster because it has the potential to reduce points all the way to the end right off the bat or early on thus reducing calls to the LineOfSight function, but it isn't so reliable around small areas/corners.  The incremental function is reliable but it calls to the LineOfSight function for each and every point along the path, this causes the path reduction to be just about as slow as calculating the path itself. 

Line of Sight is determined by getting all points between 2 points and checking that they are all flagged as Walkable.

 

My grid is dynamic and roughly 250x250.  I tried out Theta* and JPS, both were slower than my current A*.  I'm trying to avoid using a navmesh or worker thread, I suppose mostly just looking for ideas to tighten up my code :)

Have you run a profiler against your code?

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Advertisement
50 minutes ago, ApochPiQ said:

Have you run a profiler against your code?

I have.  The two biggest bottlenecks are in point reduction and the priority queue (using BlueRaja's which claims to be the fastest)

Here's a screenshot from the Unity profiler: f43ea1392f.png

Profiling alone adds a lot of overhead but I think it gives a good idea of where things sit.  I've never used a profile other than Unity's

Actually, the profiler is showing you something slightly different.

Keep in mind that what you care about in this context is not the sum total cost of every call to a function - just the average cost of a single call. If you divide the time spent in a function by its call count, you'll see that the priority queue is dirt cheap per-call, you just call it a whole lot.

This is important data on two fronts. First, you should focus your effort on your algorithm primarily. Look for ways to avoid calling code that has no effect on the final results of the path search. Second, if you look at the actual places your code is burning a lot of time, you'll get much more visible performance wins.

 

Unity has a "deep" profiler option that you should also look into.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

I took a look at some of the code:

- Heuristic.  Is distance-squared admissible for A*?  I've never seen it used for A* before and that is a warning sign to me.

- FastPriorityQueue is being called an extremely high number of times, but each operation should be EXTREMELY short, so the profiler overhead is likely to be polluting the numbers to the point where you should no longer trust them.  You'll have to move your A* code into a standalone .Net project in order to use a sampling profiler with it, as far as I know.

- I looked at the github repo for FastPriorityQueue, and there are some obvious microoptimizations that can be made to it if you determine that it really is worth it.  Change the Node to a struct, change those properties to plain old fields, perhaps change the binary heap to a cache-friendly heap (after the first few layers of a binary heap you will cache miss on every single layer).

- JPS+ should be tremendously faster than A*, so if it wasn't then there's something significantly wrong with the implementation that you tried.

I might be wrong here, but it looks like you might be enqueueing more nodes than you really ought to need to, unless you're hitting a really pathological layout.  250x250 is 62500 total nodes, and your profiler result is showing 59600 calls to Enqueue(), which appears to indicate that it is considering a pretty hefty fraction of the total nodes available.  Assuming no obstacles and an appropriate heuristic, A* ought to pretty much go in a straight line from the start to the goal, and only have to put the neighboring nodes of the ideal path on the open (priority queue) set.

You might want to try tweaking your code so that you can step through the algorithm one iteration at a time, and visualize how nodes are getting added to your open and closed sets at each step, so you can see if it is clearly doing something weird.  Five seconds is really far too long for this to take.

Where your connections all appear to be orthogonal, you might want to look at something like Manhattan Distance as a heuristic, which tends to work very well with A* in such situations.  That would tend to look something like this, lifted from my pathfinding code:


public static float ManhattanDistance(Point start, Point goal) {
  var dx = Math.Abs(start.X - goal.X);
  var dy = Math.Abs(start.Y - goal.Y);
  var h = dx + dy;

  return h;
}

 

Eric Richards

SlimDX tutorials - http://www.richardssoftware.net/

Twitter - @EricRichards22

Advertisement
1 hour ago, Nypyren said:

I took a look at some of the code:

- Heuristic.  Is distance-squared admissible for A*?  I've never seen it used for A* before and that is a warning sign to me.

- FastPriorityQueue is being called an extremely high number of times, so the profiler overhead is likely to be polluting the numbers to the point where you should no longer trust them.  You'll have to move your A* code into a standalone .Net project in order to use a sampling profiler with it, as far as I know.

- I looked at the github repo for FastPriorityQueue, and there are some obvious microoptimizations that can be made to it if you determine that it really is worth it.  Change the Node to a struct, change those properties to plain old fields, perhaps change the binary heap to a cache-friendly heap (after the first few layers of a binary heap you will cache miss on every single layer).

- JPS+ should be tremendously faster than A*, so if it wasn't then there's something significantly wrong with the implementation that you tried.

I've played around with a variety of heuristics, this is the fastest I've come across and it produces acceptable results.  I'll give JPS another try, it's very possibly my implementation is messed up.

 

1 hour ago, ericrrichards22 said:

I might be wrong here, but it looks like you might be enqueueing more nodes than you really ought to need to, unless you're hitting a really pathological layout.  250x250 is 62500 total nodes, and your profiler result is showing 59600 calls to Enqueue(), which appears to indicate that it is considering a pretty hefty fraction of the total nodes available.  Assuming no obstacles and an appropriate heuristic, A* ought to pretty much go in a straight line from the start to the goal, and only have to put the neighboring nodes of the ideal path on the open (priority queue) set.

You might want to try tweaking your code so that you can step through the algorithm one iteration at a time, and visualize how nodes are getting added to your open and closed sets at each step, so you can see if it is clearly doing something weird.  Five seconds is really far too long for this to take.

Where your connections all appear to be orthogonal, you might want to look at something like Manhattan Distance as a heuristic, which tends to work very well with A* in such situations.  That would tend to look something like this, lifted from my pathfinding code:



public static float ManhattanDistance(Point start, Point goal) {
  var dx = Math.Abs(start.X - goal.X);
  var dy = Math.Abs(start.Y - goal.Y);
  var h = dx + dy;

  return h;
}

 

Ah I didn't mention that profiler screenshot is the result of 100 paths in a single frame.  Here's a better shot for a single path: 09008e6e1f.png

 

I fixed up some code and gave Manhattan distance a test run.  Here's a  comparison between my current heuristic & Manhattan

aba18466d2.jpg

6c1bbae0a0.jpg

 

Manhattan without weight puts it at about 2,200ms.  While Manhattan makes for a better looking path, post-processing takes care of that so I'm mostly interested in speed :)

Use an instance of the Stopwatch class to time SetDestination with and without deep profiling enabled, and see how much it differs.  I still think the instrumenting profiler is causing too much overhead on the FastPriorityQueue and it would be good to find out how much there is.  It's the biggest offender if the profiler is telling the truth.

3 minutes ago, Nypyren said:

Use an instance of the Stopwatch class to time SetDestination with and without deep profiling enabled, and see how much it differs.  I still think the instrumenting profiler is causing too much overhead on the FastPriorityQueue and it would be good to find out how much there is.  It's the biggest offender if the profiler is telling the truth.

I have my debug script set up just like that :)

179f97c661.png

 

The results in the two screenshots above are from this Stopwatch snippet with deep profile & profiler disabled completely

Ok, so with profiling one call takes 31ms, and without you have 1-2ms.  Definitely too much overhead from Unity's instrumenting profiler.

If you can, take all of your pathfinding code and make a standalone .Net app out of it, and use a sampling profiler on it.  I think newer versions of Visual Studio should have a profiler built into them.  I can't tell if you're on Windows or not from the screenshots, though.  If not, there should also be some freely available profilers available that will run on other OSes.

This topic is closed to new replies.

Advertisement