Advertisement

Is this the most efficient way to implement A* in java for Dalvik VM?

Started by May 14, 2009 04:01 PM
28 comments, last by rbgrn 15 years, 6 months ago
Quote: Original post by Sneftel
Quote: Original post by Buster2000
Change your getHeuristicCost function to use FP math rather than floats. Also the sqrt can't be helping much either.

Wow, I didn't even notice that. Yeah, for something this performance-critical, particularly on an embedded platform, you want to avoid floating point as much as possible. As you mentioned in your comments, Manhattan distance might be preferable to avoid the sqrt, though you'd have to see if that leads to expanding too many nodes.

Also, for the life of me I cannot find where you're incrementing the node cost by the movement cost.


It appears as though I haven't been incrementing them. I've been using just the heuristic cost. That could be a problem!
I've now got this as optimized as I can using the binary heap and doing as little FP as I can get away with but it's just not getting the performance I need. What are your thoughts on these techniques from the gamasutra article here

In particular, this:


A Faster Implementation of the Standard A*

Before proceeding with turning and smoothing modifications to the A* algorithm, let's start with some basic optimizations that can speed up the standard A* algorithm by a factor of 40 or more. To start with, the standard A* algorithm uses a sorted linked list (or two linked lists) to track nodes that are checked. Instead, we'll use a 60x60 fixed matrix. When starting a search from point a to point b, we find the midpoint between those two and place it at point [30,30] on our matrix. Each point on the matrix stores:

*
The cost to get to the point
*
The total cost through that point to the goal
*
The [x,y] location of its "parent" tile (the tile before it on the path)
*
A Boolean stating whether or not it is on the "Open" list of actively pursued nodes, and
*
The [x,y] locations of the Previous and Next nodes in the Open list.

We also keep a separate array of 1-bit Booleans, which store whether or not each node in our matrix has been touched yet during this search. That way, we can very rapidly initialize at the beginning of the search without needing to clear the entire matrix.

Whereas the original algorithm maintains a separate sorted Open list (actually a Priority Queue), we instead maintain basic list functionality simply by using Previous and Next pointers within the fixed array. Note that we do have the memory requirement for our 60x60 matrix, but our compacted data structure requires only 16 bytes per node, for a total of 57K. (Even expanding the matrix to 120x120 will only require 230K of memory.)

Note additionally that the "list" can be implemented as a binary tree (by having two Next node pointers at each element), but we've actually found it to be substantially faster to have a simple (non-priority) list. While this does result in time O(n) for the search for the lowest cost node at the top of the A* loop (rather than O(log n) for a priority queue), it excels in that all insertions and deletions, of which there are many, are only O(1). Best of all, it eliminates the inner loop search that checks if neighboring nodes yet exist on the Open or Closed lists, which otherwise would take O(n) (or maybe a bit better if a hash table is used).

Overall, by avoiding all memory allocations and list insertions, this method turns out to be dramatically faster. I have profiled it to be as much as 40 times faster than standard A* implementations.

Note that for the Directional search described later in this article, eight times the number of nodes are necessary, so the memory requirement will all increase by a factor of eight.
Advertisement
Ok, here's my very most optimized recent code which uses a BinaryHeap implementation that I found somewhere and will post on here. I modified the BinaryHeap to run faster under Dalvik, just by creating local references to member fields, etc. Dalvik has no JIT, so all of this stuff matters.

This runs better than anything so far but I still feel like there should be ways to make it faster. I'm assuming that at this point, it'd take a fundamentally different approach to dealing with the data structures to make any real difference.

Thanks for all your help so far! I've learned a lot and my game has improved as a result.

If anyone has any additional suggestions for ways to speed this up still, I'm all ears.

Thanks!

import java.util.ArrayList;import java.util.Collections;public class AStarBHeap {	/** The heap of open AITiles to check. */	private BinaryHeap openHeap = new BinaryHeap();	/** The return path */	private ArrayList<AITile> path = new ArrayList<AITile>();	/** The map being searched */	private AITile[][] map;	/** The maximum depth of search we're willing to accept before giving up */	private int maxIterations;	private int mapWidth;	private int mapHeight;	/**	 * Create a path finder	 * 	 * @param heuristicCost	 *            The heuristic used to determine the search order of the map	 * @param map	 *            The map to be searched	 * @param maxSearchDistance	 *            The maximum depth we'll search before giving up	 */	public AStarBHeap(AITile[][] map, int mapWidth, int mapHeight, int maxIterations) {		this.map = map;		this.mapWidth = mapWidth;		this.mapHeight = mapHeight;		this.maxIterations = maxIterations;	}	/**	 * @see PathFinder#findPath(Mover, int, int, int, int)	 */	public ArrayList<AITile> findPath(int sx, int sy, int tx, int ty) {		// easy first check, if the destination is blocked, we can't get there		AITile[][] map = this.map;		ArrayList<AITile> path = this.path;		BinaryHeap openHeap = this.openHeap;		int maxIterations = this.maxIterations;		int mapWidth = this.mapWidth;		int mapHeight = this.mapHeight;		AITile targetTile = map[tx][ty];				if (targetTile.state == AITile.CLOSED) {			return null;		}		// start the open list with the source tile.		AITile sourceTile = map[sx][sy];		sourceTile.movementCost = 0;		sourceTile.totalCost = 0;		openHeap.clear();		openHeap.add(sourceTile);		targetTile.pathParent = null;		// reusable declarations for this loop and nested loops		AITile current;		int x, y;		int xp, yp;		int dx, dy;		int nextStepCost;		AITile neighbor;		int iterations = 0;		while (iterations < maxIterations) {			iterations++;			// pull out the first AITile in our open list, this is determined to			// be the most likely to be the next step based on our heuristic			current = openHeap.deleteMin();			if (current == targetTile || current == null) {				break;			}			current.isAstarOpen = false;			current.isAstarClosed = true;			// search through all the neighbors of the current AITile evaluating			// them as next steps			for (x = -1; x < 2; x++) {				for (y = -1; y < 2; y++) {					// not a neighbor, its the current tile					if ((x == 0) && (y == 0)) {						continue;					}					// if we're not allowing diagonal movement then only					// one of x or y can be set					if ((x != 0) && (y != 0)) {						continue;					}					// determine the location of the neighbor and evaluate it					xp = x + current.x;					yp = y + current.y;					// if the tile is off the grid					if (xp < 0 || yp < 0 || xp >= mapWidth || yp >= mapHeight) {						continue;					}					neighbor = map[xp][yp];					// if it's not the source tile, it can't be in a closed state.					if ((sx != xp || sy != yp) && neighbor.state == AITile.CLOSED) {						continue;					}					// the cost to get to this AITile is cost the current plus the movement					// cost to reach this AITile. Note that the heuristic value is only used					// in the sorted open list					nextStepCost = current.movementCost + 1;					// if the AITile hasn't already been processed and discarded the					// reset it's cost to our current cost and add it as a next possible					// step (i.e. to the open list)					if (!neighbor.isAstarClosed && !neighbor.isAstarOpen) {						neighbor.movementCost = nextStepCost;						// heuristic cost						dx = tx - xp;						dy = ty - yp;						// totalCost = accumulated movement cost + heuristic cost						neighbor.totalCost = nextStepCost + (float) Math.sqrt((dx * dx) + (dy * dy)) * 1.02f;						neighbor.pathParent = current;						// System.out.println("neighbor [" + neighbour.x + "," + neighbour.y + "] now has parent ["						// + current.x + "," + current.y + "]");						neighbor.isAstarOpen = true;						openHeap.add(neighbor);					}				}			}		}		//System.out.println("A* Iterations =" + iterations);		// since we'e've run out of search		// there was no path. Just return null		if (targetTile.pathParent == null) {			return null;		}		// At this point we've definitely found a path so we can uses the parent		// references of the map to find out way from the target location back		// to the start recording the map on the way.		// thats it, we have our path		path.clear();		AITile target = targetTile;		while (target.x != sx || target.y != sy) {			path.add(target);			target = target.pathParent;		}		Collections.reverse(path);		return path;	}}public class AITile implements Comparable<AITile> {	public static final int OPEN = 0;	public static final int CLOSED = 1;	public static final int SELF = 2;	public static final int TARGET = 3;	public static final int INVALID = 4;		public int state = OPEN;	public int topLeftX;	public int topLeftY;	public int width;	public int height;		public int x;	public int y;	// A* Stuff	/** The parent of this node, how we reached it in the search */	public AITile pathParent;	/** The movement cost of this node */	public int movementCost;	/** totalCost = movementCost + heuristicCost */	public float totalCost;	public boolean isAstarClosed;	public boolean isAstarOpen;	public AITile openNext;	public AITile openPrev;		public void resetAStar() {		x = 0;		y = 0;		pathParent = null;		movementCost = 0;		totalCost = 0f;		isAstarClosed = false;		isAstarOpen = false;		openNext = null;		openPrev = null;	}		public int compareTo(AITile other) {		float hc1 = totalCost;		float hc2 = other.totalCost;		if (hc1 > hc2) {			return 1;		} else if (hc2 > hc1) {			return -1;		}		return 0;	}}public class BinaryHeap {	private static final int DEFAULT_CAPACITY = 100;	private int currentSize; // Number of elements in heap	private AITile[] array; // The heap array	/**	 * Construct the binary heap.	 */	public BinaryHeap() {		currentSize = 0;		array = new AITile[DEFAULT_CAPACITY + 1];	}	public void clear() {		currentSize = 0;		array = new AITile[DEFAULT_CAPACITY + 1];	}		/**	 * Construct the binary heap from an array.	 * 	 * @param items	 *            the inital items in the binary heap.	 */	public BinaryHeap(AITile[] items) {		currentSize = items.length;		int currentSize = this.currentSize;		array = new AITile[currentSize + 1];		AITile[] array = this.array;		for (int i = 0; i < currentSize; i++) {			array = items<span style="font-weight:bold;">;<br>		}<br>		buildHeap();<br>	}<br><br>	<span class="java-comment">/**<br>	 * Insert into the priority queue. Duplicates are allowed.<br>	 * <br>	 * @param x<br>	 *            the item to insert.<br>	 * @return null, signifying that decreaseKey cannot be used.<br>	 */</span><br>	<span class="java-visibilitymodifier">public</span> AITile add(AITile x) {<br>		AITile array[] = <span class="java-keyword">this</span>.array;<br>		<span class="java-keyword">if</span> (currentSize + <span class="java-number">1</span> == array.length) {<br>			doubleArray();<br>		}<br><br>		<span class="java-comment">// Percolate up</span><br>		<span class="java-primitives">int</span> hole = ++currentSize;<br>		array[<span class="java-number">0</span>] = x;<br><br>		<span class="java-keyword">for</span> (; x.compareTo(array[hole / <span class="java-number">2</span>]) &lt; <span class="java-number">0</span>; hole /= <span class="java-number">2</span>)<br>			array[hole] = array[hole / <span class="java-number">2</span>];<br>		array[hole] = x;<br><br>		<span class="java-keyword">return</span> <span class="java-keyword">null</span>;<br>	}<br><br>	<span class="java-comment">/**<br>	 * Find the smallest item in the priority queue.<br>	 * <br>	 * @return the smallest item or null if empty.<br>	 */</span><br>	<span class="java-visibilitymodifier">public</span> AITile findMin() {<br>		<span class="java-keyword">if</span> (isEmpty()) {<br>			<span class="java-keyword">return</span> <span class="java-keyword">null</span>;<br>		}<br>		<span class="java-keyword">return</span> array[<span class="java-number">1</span>];<br>	}<br><br>	<span class="java-comment">/**<br>	 * Remove the smallest item from the priority queue.<br>	 * <br>	 * @return the smallest item or null if empty.<br>	 */</span><br>	<span class="java-visibilitymodifier">public</span> AITile deleteMin() {<br>		AITile minItem = findMin();<br>		AITile array[] = <span class="java-keyword">this</span>.array;<br>		array[<span class="java-number">1</span>] = array[currentSize–];<br>		percolateDown(<span class="java-number">1</span>);<br><br>		<span class="java-keyword">return</span> minItem;<br>	}<br><br>	<span class="java-comment">/**<br>	 * Establish heap order property from an arbitrary arrangement of items. Runs in linear time.<br>	 */</span><br>	<span class="java-visibilitymodifier">private</span> <span class="java-primitives">void</span> buildHeap() {<br>		<span class="java-primitives">int</span> currentSize = <span class="java-keyword">this</span>.currentSize;<br>		<span class="java-keyword">for</span> (<span class="java-primitives">int</span> i = currentSize / <span class="java-number">2</span>; i &gt; <span class="java-number">0</span>; i–)<br>			percolateDown(i);<br>	}<br><br>	<span class="java-comment">/**<br>	 * Test if the priority queue is logically empty.<br>	 * <br>	 * @return true if empty, false otherwise.<br>	 */</span><br>	<span class="java-visibilitymodifier">public</span> <span class="java-primitives">boolean</span> isEmpty() {<br>		<span class="java-keyword">return</span> currentSize == <span class="java-number">0</span>;<br>	}<br><br>	<span class="java-comment">/**<br>	 * Returns size.<br>	 * <br>	 * @return current size.<br>	 */</span><br>	<span class="java-visibilitymodifier">public</span> <span class="java-primitives">int</span> size() {<br>		<span class="java-keyword">return</span> currentSize;<br>	}<br><br>	<span class="java-comment">/**<br>	 * Make the priority queue logically empty.<br>	 */</span><br>	<span class="java-visibilitymodifier">public</span> <span class="java-primitives">void</span> makeEmpty() {<br>		currentSize = <span class="java-number">0</span>;<br>	}<br><br>	<span class="java-comment">/**<br>	 * Internal method to percolate down in the heap.<br>	 * <br>	 * @param hole<br>	 *            the index at which the percolate begins.<br>	 */</span><br>	<span class="java-visibilitymodifier">private</span> <span class="java-primitives">void</span> percolateDown(<span class="java-primitives">int</span> hole) {<br>		<span class="java-primitives">int</span> child;<br>		AITile array[] = <span class="java-keyword">this</span>.array;<br>		<span class="java-primitives">int</span> currentSize = <span class="java-keyword">this</span>.currentSize;<br>		AITile tmp = array[hole];<br><br>		<span class="java-keyword">for</span> (; hole * <span class="java-number">2</span> &lt;= currentSize; hole = child) {<br>			child = hole * <span class="java-number">2</span>;<br>			<span class="java-keyword">if</span> (child != currentSize &amp;&amp; array[child + <span class="java-number">1</span>].compareTo(array[child]) &lt; <span class="java-number">0</span>)<br>				child++;<br>			<span class="java-keyword">if</span> (array[child].compareTo(tmp) &lt; <span class="java-number">0</span>)<br>				array[hole] = array[child];<br>			<span class="java-keyword">else</span><br>				<span class="java-keyword">break</span>;<br>		}<br>		array[hole] = tmp;<br>	}<br><br>	<span class="java-comment">/**<br>	 * Internal method to extend array.<br>	 */</span><br>	<span class="java-visibilitymodifier">private</span> <span class="java-primitives">void</span> doubleArray() {<br>		AITile[] newArray;<br>		AITile array[] = <span class="java-keyword">this</span>.array;<br>		<span class="java-primitives">int</span> arrayLength = array.length;<br>		newArray = <span class="java-keyword">new</span> AITile[arrayLength * <span class="java-number">2</span>];<br>		<span class="java-keyword">for</span> (<span class="java-primitives">int</span> i = <span class="java-number">0</span>; i &lt; arrayLength; i++) {<br>			newArray<span style="font-weight:bold;"> = array<span style="font-weight:bold;">;<br>		}<br>		<span class="java-keyword">this</span>.array = newArray;<br>	}<br>}<br><br><br></pre></div><!–ENDSCRIPT–> 
The other direction you can go is hierarchical A* (see Game Programing Gems for an example), if your world is laid out in a hierarchical manner. In a related vein, you can use visibility graphs to speed up pathing through open space. And in an unrelated vein, you can use an anytime version of A* (google ARA*) to find as smart a path in as much time as you actually have.
Quote: Original post by Sneftel
The other direction you can go is hierarchical A* (see Game Programing Gems for an example), if your world is laid out in a hierarchical manner. In a related vein, you can use visibility graphs to speed up pathing through open space. And in an unrelated vein, you can use an anytime version of A* (google ARA*) to find as smart a path in as much time as you actually have.


The world is a fixed-cost, 1440 tile max 2d plane. If you watch
">this video
, you'll see exactly what the world looks like.

Does that spark any ideas or should I still consider hierarchical A* or ARA*? I'm almost happy with the performance right now and believe that if I use some pathfinding management in my AI code, I can make use of partial paths and won't even necessarily have to update the path every tick.
Hold up -- your world is dynamic? You're repathing every frame? D*! D*! (D*)
Advertisement
Quote: Original post by Sneftel
Hold up -- your world is dynamic? You're repathing every frame? D*! D*! (D*)


Ahhh! Now things come together. Didn't you think it was strange that I wanted my A* to run in 8ms on a mobile device? I came up with a CPU budget:

2 AI players, each running an A* algorithm at 10FPS can occupy no more than 200ms of CPU time, so that means 10ms * 10 updates each.

Now I have to roll up my sleeves and learn about how I can use D* stuff... Did you have a look at the video? Does that look like what D* is made for to you? The world is dynamic, in that the trails coming off the players are obstacles, so the world tends to look like a maze after a bit of playing.
Quote: Original post by rbgrn
Now I have to roll up my sleeves and learn about how I can use D* stuff... Did you have a look at the video? Does that look like what D* is made for to you?
I did. It's quite nearly one of the things D* is made for (the other thing being incomplete knowledge of the world). The reason it isn't entirely right for D* is because it looks like the pathfinding can have tactical as well as time-of-arrival considerations. That's just as much a problem with A*, though, so I assume you're not overly concerned with it.

The geometric programmer in me wants to model the evolving world as a visibility graph. It would be fascinating to do it that way, but it would also be breathtakingly complicated and possibly no faster and actually probably slower. Sooo.. hm.
Quote: Original post by Sneftel
Quote: Original post by rbgrn
Now I have to roll up my sleeves and learn about how I can use D* stuff... Did you have a look at the video? Does that look like what D* is made for to you?
I did. It's quite nearly one of the things D* is made for (the other thing being incomplete knowledge of the world). The reason it isn't entirely right for D* is because it looks like the pathfinding can have tactical as well as time-of-arrival considerations. That's just as much a problem with A*, though, so I assume you're not overly concerned with it.

The geometric programmer in me wants to model the evolving world as a visibility graph. It would be fascinating to do it that way, but it would also be breathtakingly complicated and possibly no faster and actually probably slower. Sooo.. hm.


Ok so I've taken your advice and I started implementing the algorithm defined as ADA* in this paper but as I was working on it I realized that it might not be more efficient. It seems to work very well for a dynamic environment because the states are maintained between ticks and so it only needs to check on things that have changed (inconsistent states), but I'm wondering if changing both the start and goal will cause every key to need to be recalculated since the key is dependent on the movement cost+h to the goal? This algorithm clearly works well for navigating to a fixed goal through a dynamic world, as shown by the examples using E of 2.5, 1.5 and 1, which is what I was also going to do with it, but I'm not sure if it's going to work correctly with the goal moving, which mine does.

I'm going to go ahead and keep working on my implementation, as I think I could have this working within the next few hours, but I'd really value your opinions here.

Thanks!

[Edited by - rbgrn on May 15, 2009 10:08:34 PM]
Ack. I think I'm out of time on this. I'm realizing that it could take me a good couple of days to get the ADA* algorithm working well. I simply don't have that kind of time left.

Thanks for all your help! I will make due with what I have for now and will try to schedule some time to work on this more before the next version.

I'm also going to post to see if anyone happens to have a working ADA* implementation. I couldn't find a single one anywhere online.

This topic is closed to new replies.

Advertisement