I wrote - or, rather, re-wrote - my A* implementation today. Actually wrote it from scratch in just about 4 hours. To speed it up, I built the open list right into the node structure. I never have to search for the best open node; I needn't check a list of closed nodes. In fact, for a small search space, all the scratch space fits in my cache. In these images, a red square is an obstacle, a blue square is the path, and a green square is one that was searched. With a heuristic of 0 -

A max-delta heuristic - max( delta_x, delta_y )

Manhatten distance - delta_x + delta_y