
A* optimisations?

Started by September 26, 2005 04:38 AM
46 comments, last by Zoma 19 years, 4 months ago
Quote:
Original post by DrEvil
If I remember right I read that lower_bound basically ended up doing a slow sort through the entire vector or something.


It doesn't do any explicit sorting; it's a variant of the binary search algorithm. It returns the first position where the value could be inserted without violating the ordering.
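To make that concrete, here is a minimal sketch of using std::lower_bound to keep a vector sorted on insert; sorted_insert is a hypothetical helper name, not something from the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Insert 'value' into an already-sorted vector, keeping it sorted.
// std::lower_bound does a binary search (O(log n) comparisons) to find
// the first position where 'value' could go without breaking the order;
// no sorting or re-sorting happens.
void sorted_insert(std::vector<int>& v, int value) {
    std::vector<int>::iterator pos = std::lower_bound(v.begin(), v.end(), value);
    v.insert(pos, value);  // O(n) element shift, but no sort pass
}
```

Note the insert itself still shifts elements, so the cost is in the vector's insertion, not in lower_bound.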

Yes, you're right, using a custom allocator type can in some cases dramatically increase performance. I've got a ridiculously high performance increase using std::deque with boost::fast_pool_allocator.
PathFinder::CalculatePath3: path-length = 95,|open|=732, |closed|=8610

This is the state after running this over a reasonable part of the map. It took about 15 seconds on a 3GHz PIV!! I don't really want to get into more advanced STL (I basically know map, vector, list, string) since I just don't have time. It looks like I need to for tweak-level gains, but for just getting it reasonable, aren't all container types fast enough?

Oh and I should add this is a debug build. It doesn't normally make any great difference to the speed of my game, but there are places where debug is a lot slower.
As with any STL usage, you have to choose the right container for the job. It doesn't take advanced STL to get good results. That's the beauty of STL: it's there to use, and it's efficient. The most important part is picking the right container, and in this case, the right sorting methods.

Edit: debug build performance evaluations are worthless, especially with STL. Always, always, always use optimized release builds when evaluating performance, or however your game will end up 'shipping'. IMO the only downside to STL is the large slowdowns it suffers in debug builds in many cases.
Quote:
Original post by d000hg
It looks like for tweak-level gains I need to, but for just getting it reasonable aren't all container types fast enough?


On modern C++ compilers all standard library containers are written to be as efficient as possible and to take advantage of optimizations; ignore the AP's generally false statement. That said, this does not mean all containers are equally fast at a particular operation, and that has nothing to do with STL itself; it has to do with the underlying data structure. Each container will be better or faster at some operations than others, which is where you choose the right container for the job. The choice generally depends on which operations are significant, important, or most used in your context: check the documentation for the time/space complexities of each relevant operation, since each container has different ones.
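As a small illustration of "same operation, different cost per container" (the helper names here are made up for the example): a membership test on a vector is a linear scan, while on a set it walks the tree ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// The same query has a different complexity per container:
// std::find on a vector scans linearly, O(n); std::set::count uses the
// tree ordering, O(log n). Neither container is "faster" in general;
// it depends on which operation dominates your code.
bool in_vector(const std::vector<int>& v, int x) {
    return std::find(v.begin(), v.end(), x) != v.end();
}

bool in_set(const std::set<int>& s, int x) {
    return s.count(x) != 0;
}
```

If the dominant operation were instead iteration or push_back, the vector would win easily; that is the "pick by dominant operation" point in the post above.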

Quote:
Original post by d000hg
Oh and I should add this is a debug build. It doesn't normally make any great difference to the speed of my game, but there are places where debug is a lot slower.


Don't profile debug builds, especially for standard library containers & algorithms; they rely heavily on, and take advantage of, compiler optimizations.
In you guys' experience then, list is crap? Vector is better, and map too? I don't have docs for hash_map on VS6. lower_bound is faster than iterating through a container myself to find where to do an insert, or I can use push_heap in a similar fashion?

Looks like I'm down to about 250ms for a path of 70 'moves', still passing by value to the containers. Maybe changing to pointers will help too, or is that insignificant compared to using a different sort/insert method?
Like AP said, you don't need a node object. My waypoint object has members for the given cost/final cost/etc... and my A* planner uses pointers to them. Works well, but a downside to using your tiles or your waypoints as your nodes is that you can't have multiple path queries running in 'parallel'.
Quote:
Original post by d000hg
In you guys' experience then, list is crap? Vector is better, and map too?


You can't say list is just crap in general; it depends on the operations you are going to do. In some contexts list will be the best for what you are doing; in other contexts it will be some other container.

Quote:
Original post by d000hg
I don't have docs for hash_map on VS6.


I seriously suggest using a different, much newer MS C++ compiler (like VC++ 7.1 or 8.0); VC++ 6.0 has poor standard C++ compliance and a not-so-good implementation of the STL. VC++ 7.1 is free to download and you can even hook it up to the VC++ 6.0 IDE if you want. Remember, hash_map is a standard library extension, so in VC++ 7.1 std::hash_map is deprecated in favour of stdext::hash_map; the docs for hash_map are here.

Quote:
Original post by d000hg
lower_bound is faster than iterating through a container myself to find where to do an insert, or I cn use push_heap in a similar fashion?


I don't have any personal experience of A*, but as some people have mentioned, it's probably best to go with a binary heap and use the standard library heap operations, which means using std::vector, std::deque, or a C-style array instead.
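A minimal sketch of that heap-based open list, using std::push_heap/std::pop_heap on a vector; the Node struct and function names are illustrative stand-ins, not the poster's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical A* open-list node; real code would also carry
// coordinates, parent links, etc.
struct Node {
    int f;  // total estimated path cost (g + h)
};

// std::push_heap/pop_heap build a max-heap under the comparator, so we
// invert the comparison to get the *lowest* f at the heap top.
bool worse(const Node& a, const Node& b) { return a.f > b.f; }

void push_open(std::vector<Node>& open, const Node& n) {
    open.push_back(n);
    std::push_heap(open.begin(), open.end(), worse);  // O(log n) sift-up
}

Node pop_best(std::vector<Node>& open) {
    std::pop_heap(open.begin(), open.end(), worse);  // best moves to back
    Node best = open.back();
    open.pop_back();
    return best;
}
```

Compared with keeping the open list sorted, a heap makes both insertion and best-node removal O(log n), which is usually the right trade for A*.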

[Edited by - snk_kid on September 26, 2005 9:24:11 AM]
Quote:
Original post by Anonymous Poster
You don't need node objects at all, you can use map tiles as the nodes instead. There's no need to iterate through the open and closed lists to discover if a node is open or not, you can just find the node directly using its coordinates and set a flag on it indicating what its status is.
Surely this only works in a recursive function. If you want to maintain the state of the search part-way through, how do you do this?

I didn't know I could use the new compilers with the VS6 IDE. I may look into that.
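The quoted "use map tiles as the nodes" idea can be sketched roughly like this; all the names here (Tile, MAP_W, the Status values) are hypothetical, chosen for the example rather than taken from anyone's code.

```cpp
#include <cassert>
#include <vector>

// Each tile carries its own A* bookkeeping, so asking "is this node
// open or closed?" is a direct lookup by coordinate instead of a
// search through open/closed lists.
enum Status { UNVISITED, OPEN, CLOSED };

struct Tile {
    int cost;       // g: cost accumulated so far
    int estimate;   // h: heuristic estimate to the goal
    Status status;  // which A* set this tile is in, if any
    Tile() : cost(0), estimate(0), status(UNVISITED) {}
};

const int MAP_W = 64, MAP_H = 64;
std::vector<Tile> map_tiles(MAP_W * MAP_H);

// Direct O(1) access by coordinate -- no list iteration needed.
Tile& tile_at(int x, int y) { return map_tiles[y * MAP_W + x]; }
```

This also shows the downside mentioned above: the per-tile flags are shared state, so two path queries can't run over the same tiles in parallel without resetting or duplicating them.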
As AP mentioned, dynamic allocations are going to cause some slowdown here. There are a few calls to new in there, and a memory pool could take care of them with a lot less overhead. Take a look at the boost custom allocators; I think there is one to handle this.

There are also boost allocators to pool memory for STL data structures, which will get rid of a lot of hidden memory allocations like expanding the lists, etc. I experienced a big speed boost when I switched over to using them, and they are pretty easy to just plug in.
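For the flavour of what such a pool buys you, here is a deliberately simplified free-list pool in plain C++ (not Boost's actual implementation; PathNode and NodePool are invented names): one up-front allocation, then constant-time reuse of slots instead of a new/delete per node.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pathfinding node for the example.
struct PathNode {
    int x, y, cost;
};

// Minimal fixed-capacity object pool: allocate all storage once, hand
// out and take back pointers via a free list. This is the core idea
// behind pool allocators, stripped of growth and thread safety.
class NodePool {
public:
    explicit NodePool(std::size_t capacity) : storage_(capacity) {
        for (std::size_t i = 0; i < capacity; ++i)
            free_list_.push_back(&storage_[i]);
    }
    PathNode* acquire() {
        if (free_list_.empty()) return 0;  // pool exhausted
        PathNode* n = free_list_.back();
        free_list_.pop_back();
        return n;
    }
    void release(PathNode* n) { free_list_.push_back(n); }

private:
    std::vector<PathNode> storage_;     // one big allocation up front
    std::vector<PathNode*> free_list_;  // recycled slots
};
```

In practice you would plug boost::fast_pool_allocator (or similar) into the container's allocator parameter rather than hand-rolling this, as the post above suggests.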

A* should be vastly faster than what you are running right now. A year ago I had an A* algorithm that worked on a graph where every node was connected to 16 other nodes, and the entire map measured several thousand positions. I can't remember the exact timing, but it could run several times within the thirtieth of a second I had to work with.
Turing machines are better than C++ any day ^_~
AP: ah, I understand better (assuming you are the same AP as earlier?). Basically, allocate a node for every tile. This seems a bit over the top though - quite memory wasteful. Surely just keeping a big pool of node objects in used and unused lists would be just as good, with far fewer objects needed. Although if you can have waypoints, then a user might create a path covering most of the map.
Can this system cover a path with waypoints where you go through the same tile multiple times?

