Advertisement

Is this the most efficient way to implement A* in java for Dalvik VM?

Started by May 14, 2009 04:01 PM
28 comments, last by rbgrn 15 years, 6 months ago
Quote: Original post by rbgrn
I was using an ArrayList which is a managed array. The performance hits were from inserting in the middle of it and checking for the existence of an element in it. My new code is much faster for both of those operations.


Oh! I see. You were using a sorted array; now you're using an unsorted linked list. Yes, the latter will be faster. However, you should try using a heap-based priority queue, as it will be faster than either. You can do this efficiently with an array and a little indexing. The textbook binary heap is described here.

...or, don't implement it yourself, and just use a "priority queue" (which I'm sure Java provides), as it certainly uses some flavor of heap under the hood.
Quote: Original post by Sneftel
Yes, but by simply using a binary heap, you can avoid middle-insertion altogether. For that matter, by using an extra field in the tile class, you can also make checking for an item in the open-list O(1) too. Fix your algorithms first, then fix your implementation.


I just switched it to use Java's PriorityQueue with the most efficient comparator I could add and it was slower than my Linked List impl. Also I am checking for the open list item using an extra field so that's taken care of.

I see what you're saying - a priority queue should be faster, although I wonder how performance is affected by the removals? Could it just be that Sun's PriorityQueue is a slow implementation somehow?

Here's my first stab at using Java 5's PriorityQueue (slow, didn't work well).
import java.util.ArrayList;import java.util.Collections;import java.util.PriorityQueue;public class AStarHeap {	/** The queue of open AITiles to check. */	private PriorityQueue<AITile> openQueue = new PriorityQueue<AITile>(10, new AStarTileComparator());	/** The return path */	private ArrayList<AITile> path = new ArrayList<AITile>();	/** The map being searched */	private AITile[][] map;	/** The maximum depth of search we're willing to accept before giving up */	private int maxSearchDistance;	private int maxIterations;	private int mapWidth;	private int mapHeight;	/**	 * Create a path finder	 * 	 * @param heuristic	 *            The heuristic used to determine the search order of the map	 * @param map	 *            The map to be searched	 * @param maxSearchDistance	 *            The maximum depth we'll search before giving up	 */	public AStarHeap(AITile[][] map, int mapWidth, int mapHeight, int maxSearchDistance, int maxIterations) {		this.map = map;		this.mapWidth = mapWidth;		this.mapHeight = mapHeight;		this.maxSearchDistance = maxSearchDistance;		this.maxIterations = maxIterations;	}	/**	 * @see PathFinder#findPath(Mover, int, int, int, int)	 */	public ArrayList<AITile> findPath(int sx, int sy, int tx, int ty) {		// easy first check, if the destination is blocked, we can't get there		AITile[][] map = this.map;		ArrayList<AITile> path = this.path;		PriorityQueue<AITile> openQueue = this.openQueue;		int maxSearchDistance = this.maxSearchDistance;		int maxIterations = this.maxIterations;		int mapWidth = this.mapWidth;		int mapHeight = this.mapHeight;		AITile targetTile = map[tx][ty];				if (targetTile.state == AITile.CLOSED) {			return null;		}		// start the open list with the source tile.		AITile sourceTile = map[sx][sy];		sourceTile.cost = 0;		sourceTile.depth = 0;		openQueue.clear();		openQueue.add(sourceTile);		targetTile.pathParent = null;		// while we haven'n't exceeded our max search depth		int maxDepth = 0;		// reusable declarations for this loop and nested loops		AITile current;		int x, y;		int xp, yp;		float nextStepCost;		AITile neighbor;		int iterations = 0;		while ((iterations < maxIterations) && (maxDepth < maxSearchDistance) && (openQueue.size() > 0)) {			iterations++;			// pull out the first AITile in our open list, this is determined to			// be the most likely to be the next step based on our heuristic			current = openQueue.remove();			if (current == targetTile) {				break;			}			current.isAstarOpen = false;			current.isAstarClosed = true;			// search through all the neighbors of the current AITile evaluating			// them as next steps			for (x = -1; x < 2; x++) {				for (y = -1; y < 2; y++) {					// not a neighbor, its the current tile					if ((x == 0) && (y == 0)) {						continue;					}					// if we're not allowing diagonal movement then only					// one of x or y can be set					if ((x != 0) && (y != 0)) {						continue;					}					// determine the location of the neighbor and evaluate it					xp = x + current.x;					yp = y + current.y;					if (isValidLocation(map, sx, sy, xp, yp, mapWidth, mapHeight)) {						// the cost to get to this AITile is cost the current plus the movement						// cost to reach this AITile. Note that the heuristic value is only used						// in the sorted open list						nextStepCost = current.cost;						neighbor = map[xp][yp];						neighbor.visited = true;						// if the new cost we've determined for this AITile is lower than						// it has been previously makes sure the AITile hasn't						// determined that there might have been a better path to get to						// this AITile so it needs to be re-evaluated						if (nextStepCost < neighbor.cost) {							// remove from open list							neighbor.isAstarOpen = false;							openQueue.remove(neighbor);							// it's neither open nor closed.							neighbor.isAstarClosed = false;						}						// if the AITile hasn't already been processed and discarded the						// reset it's cost to our current cost and add it as a next possible						// step (i.e. to the open list)						if (!neighbor.isAstarClosed && !neighbor.isAstarOpen) {							neighbor.cost = nextStepCost;							neighbor.heuristic = getHeuristicCost(xp, yp, tx, ty);							maxDepth = Math.max(maxDepth, neighbor.setParent(current));							// System.out.println("neighbor [" + neighbour.x + "," + neighbour.y + "] now has parent ["							// + current.x + "," + current.y + "]");							neighbor.isAstarOpen = true;							openQueue.add(neighbor);						}					}				}			}		}		//System.out.println("A* Iterations =" + iterations);		// since we'e've run out of search		// there was no path. Just return null		if (targetTile.pathParent == null) {			return null;		}		// At this point we've definitely found a path so we can uses the parent		// references of the map to find out way from the target location back		// to the start recording the map on the way.		// thats it, we have our path		path.clear();		AITile target = targetTile;		while (target.x != sx || target.y != sy) {			path.add(target);			target = target.pathParent;		}		Collections.reverse(path);		return path;	}	/**	 * Check if a given location is valid for the supplied mover	 * 	 * @param mover	 *            The mover that would hold a given location	 * @param sx	 *            The starting x coordinate	 * @param sy	 *            The starting y coordinate	 * @param x	 *            The x coordinate of the location to check	 * @param y	 *            The y coordinate of the location to check	 * @return True if the location is valid for the given mover	 */	protected boolean isValidLocation(AITile[][] map, int sx, int sy, int x, int y, int mapWidth, int mapHeight) {		if ((x < 0) || (y < 0) || (x >= mapWidth) || (y >= mapHeight)) {			return false;		}		if (((sx != x) || (sy != y))) {			return (map[x][y].state != AITile.CLOSED);		}		return true;	}	/**	 * Get the heuristic cost for the given location. This determines in which order the locations are processed.	 * 	 * @param mover	 *            The entity that is being moved	 * @param x	 *            The x coordinate of the tile whose cost is being determined	 * @param y	 *            The y coordiante of the tile whose cost is being determined	 * @param tx	 *            The x coordinate of the target location	 * @param ty	 *            The y coordinate of the target location	 * @return The heuristic cost assigned to the tile	 */	public float getHeuristicCost(int x, int y, int tx, int ty) {		float dx = tx - x;		float dy = ty - y;		float result = (float) (Math.sqrt((dx * dx) + (dy * dy)) * 1.02f);		// stupid euclidean distance. try manhattan instead with 1/50th tiebreaker		// float result = (Math.abs(x - tx) + Math.abs(y - ty)) * 1.02f;		return result;	}}


[Edited by - rbgrn on May 15, 2009 1:13:34 PM]
Advertisement
I found a faster BinaryHeap implementation for Java and optimized it for the Dalvik VM but the savings really aren't that great. The search space is never much over 1000 tiles so the open list just doesn't get that big. I also removed this block of code because I don't feel like it's right. Can anyone tell me why I should keep this in the main part of my A*?

// if the new cost we've determined for this AITile is lower than						// it has been previously makes sure the AITile hasn't						// determined that there might have been a better path to get to						// this AITile so it needs to be re-evaluated						if (nextStepCost < neighbor.cost) {							// remove from open list							neighbor.isAstarOpen = false;							openQueue.remove(neighbor);							// it's neither open nor closed.							neighbor.isAstarClosed = false;						}


I thought about it and am confused as to why we'd ever need to re-open a tile for evaluation. I removed the block of code, which allowed for the faster binary heap to work, since I'm no longer removing elements that aren't on the head. Everything seems to work correctly, but I'm still not getting the performance I need.

Does it seem odd that 150 iterations (multiply by neighbor checks and that's around 600 calls to add to the heap, check heuristics and compare) is all I can fit in with 10ms of CPU time?

ARMs don't have FPUs so maybe my heuristic function is too expensive? Any ideas of where to look next?

Thanks for all your help so far!
Quote: Original post by rbgrn
Quote: Original post by Ezbez
Are emulators a valid measurement of performance? I couldn't imagine that they would be, but I have zero experience in the matter.


Yes! The devkit emulator is very comparable to the real device. The actual device may be slightly faster or slower in some scenarios but overall they got the ARM emulation down to where it's very reliable as a baseline.


Wow, that's a really impressive feature!
Change your getHeuristicCost function to use FP math rather than floats. Also the sqrt can't be helping much either.
Quote: Original post by rbgrn
Could it just be that Sun's PriorityQueue is a slow implementation somehow?
It's possible. The bigger issue is that PriorityQueue.remove() is O(n). By implementing your own priority queue you can keep an extra variable in Tile saying where in the open-list that tile currently is, to cut down on that time. For best performance, implement a Fibonacci heap, and use the "decrease cost" operation instead of the "delete" operation to update the open list when a quicker path to a node is found. That makes that update O(1). Fibonacci heaps are a pain to implement, but at least they're not as bad as red-black trees.
Advertisement
Quote: Original post by Buster2000
Change your getHeuristicCost function to use FP math rather than floats. Also the sqrt can't be helping much either.

Wow, I didn't even notice that. Yeah, for something this performance-critical, particularly on an embedded platform, you want to avoid floating point as much as possible. As you mentioned in your comments, Manhattan distance might be preferable to avoid the sqrt, though you'd have to see if that leads to expanding too many nodes.

Also, for the life of me I cannot find where you're incrementing the node cost by the movement cost.
Quote: Original post by rbgrn
I thought about it and am confused as to why we'd ever need to re-open a tile for evaluation.

It only happens if your heuristic is not consistent (which it is, for the euclidean case with no wormholes). Soooo... just ignore my message from before about Fibonacci heaps.
I don't know about Android, but the ARM CPUs in the iPhone and newer Symbian devices definitly have FPUs which are significantly quicker than using fixed point...
Quote: Original post by BigJim
I don't know about Android, but the ARM CPUs in the iPhone and newer Symbian devices definitly have FPUs which are significantly quicker than using fixed point...


There are a few phones now but the baseline is the HTC Dream, sold as the T-Mobile G1, which uses Qualcomm Integrated CPU/GPU with a ARM1136EJ-S at its core.

I read some discussion in more than one place that says the ARM1136EJ-S does not have a dedicated FPU but instead relies on some sort of virtual floating point processing.

It does run at between 400Mhz and 500Mhz so one would think you could do a reasonable number of floating point operations regardless.

This topic is closed to new replies.

Advertisement