A* optimisations?

Author

1,208

September 26, 2005 04:38 AM

I've implemented a first try at A* for better path-finding in a tile-based world. Preliminary tests seems to show that it works, but is very slow - for a path of ~100 tiles it took ~0.25s (on a 750MHz processor) which is unacceptable for real-time use, even if the calculation is spread over several frames. I reckon i need to boost performance by an order of magnitude, which may be possible since this is a straight-from -the-book implementation with no performance considerations. Here is the method which calculates the path. Note that m_Open & m_Closed are std::lists. It's not too long or complex!

bool PathFinder::CalculatePath3(RTSUnit *pUnit,WORD destX,WORD destY)
{
	//if this unit is a user of any other paths, unregister it from them
	Cancel(pUnit);

	WORD startX = pUnit->Pos().x, startY = pUnit->Pos().y;
	m_Open.clear();
	m_Closed.clear();
	
	PathSearchNode N,Nnew,*pN;
	
	m_Open.push_front(PathSearchNode(0,startX,startY,destX,destY));
	while(!m_Open.empty())
	{
		N = m_Open.front();
		m_Open.pop_front();
		if(N.m_X == destX && N.m_Y == destY)
		{
			//found a route - genreate the list of tiles
			PathDataTileList *pPath = new PathDataTileList(this);
			
			for(pN = &N; pN; pN=pN->m_pParent) 
			{
				pPath->AddPoint(pN->m_X,pN->m_Y);
			}
			m_Paths.push_back(pPath);
			m_PathLocations.push_back(new PathLocation(pPath,pUnit));
			return true;
		}

		//push N onto closed and remember it's address for setting it as a parent to the new nodes
		m_Closed.push_back(N);
		pN = &m_Closed.back();

		//find the 8 surrounding nodes
		for(WORD y = max(N.m_Y,1)-1; y<= N.m_Y+1; ++y)
		for(WORD x = max(N.m_X,1)-1; x<= N.m_X+1; ++x)
		if(y!=N.m_Y || x!=N.m_X)	//don't select current node again!
		{
			Nnew.Set(pN,x,y,destX,destY);
			
			bool skip =false;
			// check - if Nnew in closed or open, and existing version has lower G, then skip this one
			for(std::list<PathSearchNode>::iterator Iopen = m_Open.begin(); Iopen!= m_Open.end(); ++Iopen)
				if((*Iopen) == Nnew && (*Iopen).m_G <= Nnew.m_G)
				{	skip =true;		break;	}
			for(std::list<PathSearchNode>::iterator Iclosed = m_Closed.begin(); Iclosed!= m_Closed.end(); ++Iclosed)
				if((*Iclosed) == Nnew && (*Iclosed).m_G <= Nnew.m_G)
				{	skip =true;		break;	}
			if(skip)
				continue;

			//if in closed, remove it
			if(Iclosed != m_Closed.end())
				m_Closed.erase(Iclosed);

			//add to open
			if(Iopen == m_Open.end())
				m_Open.push_front(Nnew);
		}
		//sort all nodes by ascending F.
		m_Open.sort();
	}
	return false;
}

For reference, here is the node class I am using:

class PathSearchNode
{
public:
	PathSearchNode *m_pParent;
	float m_F,m_G,m_H;	//G is cost to this node, H heuristic estimate from here to end, F node cost G+H
	WORD m_X,m_Y;		//the tile this node is located at

	PathSearchNode(){}
	PathSearchNode(PathSearchNode *pParent,WORD X,WORD Y,WORD destX,WORD destY)
	{
		Set(pParent,X,Y,destX,destY);
	}
	void Set(PathSearchNode *pParent,WORD X,WORD Y,WORD destX,WORD destY)
	{
		m_pParent = pParent;
		m_X = X;	m_Y = Y;
		//no parent - start point
		if(!m_pParent)
			m_G=0;
		else if(m_pParent->m_X != m_X && m_pParent->m_Y != m_Y)
			m_G = m_pParent->m_G + 1.4f;	//extra cost for diagonal movement
		else
			m_G = m_pParent->m_G + 1.0f;
		m_H = EstimateGoalDistance(destX,destY);
		m_F = m_G + m_H;
	}

	bool operator<(const PathSearchNode &comp)
	{	return (m_F < comp.m_F);	}
	bool operator>(const PathSearchNode &comp)
	{	return (m_F > comp.m_F);	}
	bool operator==(const PathSearchNode &comp)
	{	return (m_X==comp.m_X && m_Y==comp.m_Y);	}

	int EstimateGoalDistance(WORD destX,WORD destY)
	{
		int dx = max(m_X,destX) - min(m_X,destX);
		int dy = max(m_Y,destY) - min(m_Y,destY);
		return max(dx,dy);
	}
};

To my mind, the fact that I keep iterating through the open/closed lists is probably a bad thing. Also, I am just pushing/popping from the Open list and calling std::list::sort on it. Recursion can be used to eliminate the open and closed lists completely, but a) can the stack take this and b)that means I can't spread the path-finding calculations over a second or two. Any advice would be gratefully received.

Kylotan

10,513

September 26, 2005 05:28 AM

Most important suggestion: don't sort the open list every time. I bet that is where 90% of your processing time is going. Sorting is slow - sorting a list is very slow. Sorting is almost always much slower than the iteration you were worried about. So instead of adding to the front and then sorting the list, just iterate through the open list and find the correct place to insert each new value. In other words, keep the list sorted as you go.

Second suggestion: make your closed list an std::map, with the key being the x,y pair. That should give you much faster lookup when you're searching for the nodes.

B_Knight

100

September 26, 2005 05:34 AM

in "Game Developer Magaazine" (www.gdmag.com) issues 02-07/2005 you will find a loooooooooong article on A* optimization.

We'll fight till we win, or we'll fight util we die !!!

snk_kid

1,312

September 26, 2005 06:18 AM

Quote: Original post by Kylotan
Most important suggestion: don't sort the open list every time. I bet that is where 90% of your processing time is going. Sorting is slow - sorting a list is very slow. Sorting is almost always much slower than the iteration you were worried about. So instead of adding to the front and then sorting the list, just iterate through the open list and find the correct place to insert each new value. In other words, keep the list sorted as you go.

Yeah i'll give a hint check out std::lower_bound for when you add a new element.

Anonymous

September 26, 2005 06:25 AM

What you really want to do is use a binary heap for storing your open and close lists... Also, make sure you store pointers to your nodes/cells.
STL incurs lots of hidden overheads, and while the templates are very flexible, they are never as good as a specialized implementation.
I have got A* working well on a PS2 with a large number of units requesting paths, by time-slicing my calculations.

d000hg

Author

1,208

September 26, 2005 06:30 AM

Thanks guys. My docs don't say what search stl::list::sort() uses - which isn't too helpful! Maybe it sorts everything from scratch, or maybe a partially sorted list is much quicker to sort?

I'll modify it to insert directly into the open list next, so no sorting is required. Is there anything else I can look at - is my algorithm doing something dumb? I guess that 2D loop can be optimised out simply, but probably that's a small difference at best.

The thing is, most of the time there should be no collisions so it seems a waste do do A* (which generates about 5 new nodes per step) when mostly it's just taking the shortest route anyway! I suppose in theory I could do it first at a lower detail - work on big 2x2 or 4x4 'multitiles;, only looking on a single tile level where collisions occur. Is this or other LOD used in conjuction with A* commonly?

Why would Map be faster than iterating through elements - isn't this what Map::find does?

d000hg

Author

1,208

September 26, 2005 06:33 AM

Quote: Original post by Anonymous Poster
What you really want to do is use a binary heap for storing your open and close lists... Also, make sure you store pointers to your nodes/cells.
STL incurs lots of hidden overheads, and while the templates are very flexible, they are never as good as a specialized implementation.
I have got A* working well on a PS2 with a large number of units requesting paths, by time-slicing my calculations.

I'm sure the STL Vs roll-yer-own argument has shown us time and time again that STL is quite adequate on the x86 platform. A* is a standard graphing problem so why would STL be slow?
I'm not sure if using pointers helps me, although I probably should use new anyway and then store all the nodes for re-use.

Anonymous

September 26, 2005 06:36 AM

If the dimensions of your world are fixed, then you shouldnt really be creating new instances per step...
Your world cells should really exist in a 2D array, where each item contains data such as if it is walkable space or not etc,,...
Each cell should also contain data on costs etc... or you could have another 2D array which matches the world cell array, to store the costs weightings...

Your open and close lists should contain pointers to these cells as your A* algorithm computes the path, so your are not sorting large chunks of data, you are just swapping pointers. And, like I said earlier, use BinaryHeaps to store the open and close lists as they have quick insertion and O(1) removal of the cheapest item. They can also be optimized for relatively fast membership tests etc,...

Anonymous

September 26, 2005 06:42 AM

STL is slow in comparison to an optimized specific implementation.
Also, swapping data structures as opposed to pointers is HUGELY SLOW.
A simple 256X256 cell world has 65536 cells, and in situations where your search expands in all directions (such as in dead end "U" zones), the amount of data swapping becomes one of the main bottlenecks

snk_kid

1,312

September 26, 2005 06:48 AM

Quote: Original post by Anonymous Poster
STL incurs lots of hidden overheads, and while the templates are very flexible, they are never as good as a specialized implementation.

Well thats not generally true, while it is true that each implementation of the C++ standard library containers & algorithms will vary from each it's only a very minor variation and there are not "lots of hidden overheads". Infact on virtually all modern C++ compilers there implementations use advance idioms & optimization techniques that you have most likely never heard of before.

You have virtually no chance of out doing std::vector/list/deque/etc with your own custom written one, maybe getting roughly eqaul efficiency (because you know about the advance techniques & idioms used) but not any better. Unless you mean't something else about "specialized implementation" (like using platform-dependant features) that is a false statement. Besides the point all standard library containers are parameterized by allocator type so you can use custom memory models/schemes with them.

Quote: Original post by Anonymous Poster
What you really want to do is use a binary heap for storing your open and close lists... Also, make sure you store pointers to your nodes/cells.

In which case the standard library provides heap operations for random accessible containers.

[Edited by - snk_kid on September 26, 2005 7:48:40 AM]

A* optimisations?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

A* optimisations?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines