Advertisement

Implementing Graph Searches

Started by September 26, 2002 09:04 AM
8 comments, last by Mnsr Cola 22 years, 1 month ago
Hello, I am getting very confused trying to implement(in C++) a simple breadth first search of a graph. I have difficulties in two areas: a) Do I create new nodes (via the ''new'' operator) to put in the queue or do I use the existing nodes that comprise the graph? b)How do I keep a record of previously visited nodes. Everytime I try to think about the implementation my mind starts hurting! I would be very grateful if any of you could help me out. Also, I''d be grateful if you could direct me to any C++ code for this type of search which I can peruse. (all I''ve seen is pseudo code so far) I realize there are much better algorithms than breadth-first but I really want to understand this before I move on to A* etc. Thanking you all for any help I may receive
Well, for starters I would have asked this in another forum. Not because it doesn''t fit here; but because it also fits in general programming and you might get more responces in general programming.

To address your questions:

a.) No, you do not create new nodes to put in the queue. Remember, these are ''bookmarks'' to where you left off. You don''t want to create a new page, just be able to get back to the one you left off at (if that makes any sence at all).

b.) You don''t. Well, really you do - but it is built into the algorithm (and data structure).

Try to create a depth-first search if it is any less confusing for you (most people find depth-first simpler to visualize anyway). Then all you really have to do use a queue not a stack.

Anyway - you really shouldn''t need sample code (and I am much to lazy to write any now). If you find the concpets difficult go back and look at traversing trees since many of the same principles apply (and they are a bit easier).

-mongrelprogrammer
- I hate these user ratings. Please rate me down. (Seriously) -
Advertisement
Actually, I''d offer a different answer to mongrelprogrammer...

a) It depends on how you''ve structured your approach to the problem. If, for instance, you are growing your graph as you go, then yes, you would be adding new nodes (new leaves to the tree) each time you branch on a given node. If however you already have a graph structure defined in memory, then you wouldn''t be adding new nodes. This is what I believe mongrelprogrammer interpreted from your post.

Which are you doing, expanding tree or fixed graph?

b) There are many ways to keep track of previously expanded nodes. Generally speaking a list of nodes, or node IDs is used for trees. For graphs, a flag in the node can be used to signify that the node has been expanded (if the flag is numbered, then it''s depth can be recorded as well). A common method - and one that you''ll find useful if you''re going to move onto A* - is to move nodes that have been expanded (have children) onto a CLOSED list. This list is still used to verify shortest paths against new nodes.

Here''s a general algorithm for breadth first search of expanding graphs (trees). The only variation needed to this algorithm for other forms of search is the insert and extraction list operations.
Create two lists, OPEN and CLOSEDInsert StartNode into OPENWHILE(not(empty(OPEN))) do   remove first node in OPEN and set as NextBranch   IF NextBranch=Goal      return ancestors of NextBranch   ENDIF   Add Next Branch to CLOSED   FOR i=1 to branching_factor      Create child(i) of NextBranch      compute path cost from StartNode to child(i)      IF child(i) already in OPEN         keep version with lowest path cost      ELSIF child(i) already in CLOSED         keep version with lowest path cost      ELSE         Add child(i) to end of OPEN   ENDFOREND WHILE 


Each node needs to keep track of its children, so an array of pointers to children nodes is useful, as well as keeping track of it''s parent (a pointer to the parent node). When replacing a node in the OPEN and/or closed list with a newly created child node, you need to make sure you set the correct parent of the node and children if the node is in the closed list.

If you don''t expect your tree to have joins (i.e., sub-trees connected at points other than their common branching point) then you can remove the cost computation and comparison based on cost. Then it becomes a simple brute force expansion of the tree to find the goal node.

I hope this helps.

Cheers,

Timkin
quote: Original post by Mnsr Cola
a) Do I create new nodes (via the ''new'' operator) to put in the queue or do I use the existing nodes that comprise the graph?

You can do it whichever way is easiest for you. Personally I create separate search node objects. This makes it easier for me to generalise the searching to things other than maps, such as turn-based game decision making. It also means that the existing graph nodes don''t need to have extra memory space that is only ever used during pathfinding. Finally, if you start doing more advanced searches, you might end up having more than 1 valid way of reaching a given node (eg. from different directions), so this won''t be easy to store as part of the original node.

So I would recommend - use a separate object where possible, and only merge it in with the underlying structure if you really need the performance gains. Once you switch to A*, you probably won''t need those gains, especially if you preallocate the search node objects first.

quote: b)How do I keep a record of previously visited nodes.

Here''s my implementation:

typedef std::list<PathNode*> PathNodeQueue;
PathNodeQueue closedList;

A PathNode is the separate node-type that I use. I just store pointers to them for now, and it all works fine. I just push the PathNode* onto that list when I finish generating its successor nodes. The OpenList is pretty much the same, except I obviously call ''new'' to create the PathNode before I push it on there.

[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost | Asking Questions | Organising code files | My stuff ]
Thank you genetlemen for your responses

I''m still very confused though (please be patient with me). Assume that every edge traversal is simply a unit cost. Now picture a simple graph of 5 nodes A,B,C,D,E

A is connected to B,C
B is connected to A,C,D
C is connected to A,B,D
D is connected to B,C
E is connected to E

Now, if you follow your algorithm, nodes will be duplicated on the open list but with ^different^ parents. This cannot be so if, as mongrelprogrammer states, nodes do not need to be created. This, I believe, is the root of my confusion. (and the reason I''d like to see a C++ implementation. Alas, I have still to find one through Google)


For code, check out either CMU Artificial Intelligence Repository, or try this library

quote:
AISEARCH is a C++ class library for search algorithms implemented by Peter Bouthoorn . It includes implementations of DFS, BFS, uniform cost, best-first, bidirectional DFS/BFS, and AND/OR DFS/BFS search algorithms. It is available by anonymous ftp from obelix.icce.rug.nl:/pub/peter/ as aisearch.zip or aisearch.tar.Z.


If you do, let us know what you think of it please. I haven''t had time to check it out myself...


As to your sample graph... node E isn''t connected to anything else, so it''s superfluous. I take it that is a typo!

Let''s assume that you want to start at node A and find the shortest path to node D. Your search tree would look something like this...

You expand node A to find its children, generating the subtree
       A      / \     /   \    B     C 

If you had an open and a closed list, CLOSED would now contain node A and OPEN would contain nodes B and C. Assuming B went onto OPEN before C, we expand about node B, producing
       A      / \     /   \    B     C   /|\  / | \ A  D  C(2) 


CLOSED has nodes {A,B} and OPEN has nodes {C,D} (A is not in OPEN since it already exists in CLOSED. Also, note that C(2), the second occurrence of C, is already in OPEN so we do not add it to OPEN again, because the path cost to C(2) is 2, while the path cost to C is 1.). Following the algorithm above, we expand node C, because is was put in OPEN before node D was, so we get
         A        / \       /   \      /     \     /       \    B         C   /|\       /|\  / | \     / | \ A  D  C   D  A  B 

Now, CLOSED contains {A,B,C} and OPEN contains {D}. No new nodes are created in the tree as they already exist. The algorithm attepmts to expand about node D, because it is at the top of the OPEN list, but finds that this is the goal and so returns the ancestors of this version of node D, which are B and A. Thus, the path from A to the goal is the reverse of D+Ancestors(D), which is A,B,D.

I hope this helps clarify things for you. Try out some more complex examples on paper for yourself to get the hang of the algorithm. Once you have the hang of that, you can look at depth-first and best-first search algorithms!

Cheers,

Timkin
Advertisement
Mnsr Cola, to put things in different words (because it sometimes helps) - your search nodes don''t really have different ''parents'' to the original graph, as the original graph doesn''t have parents as such. Instead, it has neighbours/siblings/adjacent nodes/whatever you want to call them. The search algorithm itself uses the parent/child concept because you''re travelling in a certain direction each time. For each step, where you start is the parent and where you end up is the child. But these links between nodes are limited to being the same ones as you already specified.

Timkin, Bouthoorn''s library is quite comprehensive, and was deliberately designed for use with knowledge representation networks and predicate logic data structures as well as the more common pathfinding and general state-space search. I expect it works ok since it was distributed as part of his object-oriented AI book and therefore has had plenty of eyes reviewing the code, but it certainly isn''t very good for a new AI programmer to learn from as the code is (necessarily) very generic. Just my opinion, of course.

[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost | Asking Questions | Organising code files | My stuff ]
I''m not sure if it''ll help you any, but the boost library has some classes for graphs and traversal. They have methods of traversing them breadth-first too.

I''ve never actually used them so I can''t offer any more thoughts on them.

http://www.boost.org/libs/graph/doc/table_of_contents.html

Hope this helps.

Helpful links:
How To Ask Questions The Smart Way | Google can help with your question | Search MSDN for help with standard C or Windows functions
I think I am beginning to understand now. Sometimes one needs these things broken down step by step.

Because graph traversals in general use a different node type for the search (as Kylotan says, using parents/siblings and cost) and for the actual graph(just neighbours and edges) I guess it would be wise to utilise the former when doing searches rather than expanding the fuctionality of the latter. In short, to use two classes/structs instead of one. Is this correct?

Thank you gentlemen, for your time and patience.
Basically, your initial ''graph'' stores the layout of your data. But the traversal process itself needs its own data, which can be thought of as a graph itself, or more precisely a tree. So it does make sense to have different node types for these purposes, and to use 2 classes instead of one.

[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost | Asking Questions | Organising code files | My stuff ]

This topic is closed to new replies.

Advertisement