novel pathfinding techniques using neurodynamic programming
Dear All,
My good friend Dominic Driver at the University of Reading, England, has given me a paper he has written as part of his Masters Project (which is based on pathfinding techniques for games).
The paper discusses the use of NeuroDynamic Programming to build better pathfinding algorithms. He suggests that the possible advantages that this could pose, include using cover whilst moving to a specific location (without specifically programming the entity to move to that cover), finding areas of high ground, avoiding potential ambush areas etc. All contained within the pathfinding routine.
To read the article, it''s free at:
www.neuronlogic.com/literature.htm
Move to the section ''White papers'', and Dom''s should be the first one there, just select ''read it here'', and the paper should load.
Enjoy!!!
Matt
Hi Matt,
You might like to tell your friend that what he is thinking of doing has, to some extent, already been done. He should read: "A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes", by Kearns, Mansour and Ng. It''s available online (through citeseer I believe).
Also, the ideas he has about using lower transition costs near cover has some problems. These costs will depend on the known, or predicted location of the player, since there is little or no extra value in standing against a wall if the player is on the same side of it as you are (and can see you). He should check out the D* algorithm for dynamic path planning in variable cost environments.
Cheers,
Timkin
You might like to tell your friend that what he is thinking of doing has, to some extent, already been done. He should read: "A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes", by Kearns, Mansour and Ng. It''s available online (through citeseer I believe).
Also, the ideas he has about using lower transition costs near cover has some problems. These costs will depend on the known, or predicted location of the player, since there is little or no extra value in standing against a wall if the player is on the same side of it as you are (and can see you). He should check out the D* algorithm for dynamic path planning in variable cost environments.
Cheers,
Timkin
I'm just following the conversation and thought I'd drop a link to a set of white papers on D* (Tony Stentz) that I found the link to here: Game AI Resources: Pathfinding. // edit - The sparse sample paper can be found here: Andrew Ng's Homepage
[edited by - lessbread on September 18, 2002 11:44:32 PM]
[edited by - lessbread on September 18, 2002 11:44:32 PM]
"I thought what I'd do was, I'd pretend I was one of those deaf-mutes." - the Laughing Man
Timkin and others,
Thanks for your reply, I will pass on those references to Dom. The prject is currently at the research stage, and I believe he is looking to advance some areas of development if possible, and so if this has already been done then he may well look elsewhere. Although, I believe what he may also consider doing is to build a tool which will automatically ''detect'' points of tactical interest on a game map, and hence can be used in the map-building process to reduce the development time. Such a tool would reduce the requirement of map designers having to place waypoints all over the map, and would possibly improve the ''feel'' of the AI?
Regards and thanks for the feedback
Matt
Thanks for your reply, I will pass on those references to Dom. The prject is currently at the research stage, and I believe he is looking to advance some areas of development if possible, and so if this has already been done then he may well look elsewhere. Although, I believe what he may also consider doing is to build a tool which will automatically ''detect'' points of tactical interest on a game map, and hence can be used in the map-building process to reduce the development time. Such a tool would reduce the requirement of map designers having to place waypoints all over the map, and would possibly improve the ''feel'' of the AI?
Regards and thanks for the feedback
Matt
For the learning part your friend should consider policy sharing.
The map is already available and to each state in the map you can assign the values of possible actions that can be updated using Q-learning like techniques.
In this way the NPCs don''t have their own policy, but instead use the "map policy".
Then adding more NPC will speed up learning.
Learning these things will be very slow so every speedup is welcome....
The map is already available and to each state in the map you can assign the values of possible actions that can be updated using Q-learning like techniques.
In this way the NPCs don''t have their own policy, but instead use the "map policy".
Then adding more NPC will speed up learning.
Learning these things will be very slow so every speedup is welcome....
Extending smilydon''s thoughts...
One of the greatest difficulties in learning policies in dynamic domains is extrapolating local knowledge (in the 4-D space time) to global knowledge. For example, you notice a state change in a door (it''s now closed), but how does that affect your knowledge of the world on the other side of the door. It''s impossible to extrapolate or extend local observations unless you have a global model.
Having multiple agents building a shared policy eases this problem. Consider an n-state domain with n agents, each occupying one of the states. No matter how the domain changes, a complete model of the domain is available at any time by simply querying all of the agents (or having them deposit their observations in a central, shared database). Now decrease the number of agents and give them a region to learn. Make sure the regions overlap spatially, but not temporally (otherwise you''d have two agents trying to occupy the same state). It would be possible to write down an approximate value iteration/policy iteration formula (based on the limited horizon problem of each agent) for the optimal policy. This would be an interesting research problem and about right for Master''s level.
On another thought, you might want to have your friend contact Michael Kearns to see if anyone has implement his sparse sampling algorithm using an ANN. If they haven''t, then that is certainly a Masters Project.
Cheers,
Timkin
One of the greatest difficulties in learning policies in dynamic domains is extrapolating local knowledge (in the 4-D space time) to global knowledge. For example, you notice a state change in a door (it''s now closed), but how does that affect your knowledge of the world on the other side of the door. It''s impossible to extrapolate or extend local observations unless you have a global model.
Having multiple agents building a shared policy eases this problem. Consider an n-state domain with n agents, each occupying one of the states. No matter how the domain changes, a complete model of the domain is available at any time by simply querying all of the agents (or having them deposit their observations in a central, shared database). Now decrease the number of agents and give them a region to learn. Make sure the regions overlap spatially, but not temporally (otherwise you''d have two agents trying to occupy the same state). It would be possible to write down an approximate value iteration/policy iteration formula (based on the limited horizon problem of each agent) for the optimal policy. This would be an interesting research problem and about right for Master''s level.
On another thought, you might want to have your friend contact Michael Kearns to see if anyone has implement his sparse sampling algorithm using an ANN. If they haven''t, then that is certainly a Masters Project.
Cheers,
Timkin
Alright Lads,
Dom Driver here, author of the paper in question. I have to thank Matt for the promotion, but I must also stress that the ideas laid out in the paper have been advanced upon (theoretically) since it was written. What is said is true, but the idea now is to combine the ND pathing techniques with a proposed terrain recognition tool to produce a robust pathing system. The system should be able to determine cover IN RELATION TO the player, as well as lines of best approach/tactical significance. The tool will influence the Markov decision process relative to the NN being used for world approximations.
The tool should also identify those terrain features being most problematic to pathfinding algorithms (eg horseshoe shapes) and take steps to either simplify or ''pseudo-remove'' the offending obstacles. The combination of the ND methods and the design tool into a working system should be more than a challenge, and it remains to be seen what lines of investigation will be fully explored. In order to aid my broader understanding, I''m preparing a questionnaire for online participation. If I get permission, I''ll try and set it up here. If not, I''ll get back to you.
Thanks for the continued help
Dom Driver
Dom Driver here, author of the paper in question. I have to thank Matt for the promotion, but I must also stress that the ideas laid out in the paper have been advanced upon (theoretically) since it was written. What is said is true, but the idea now is to combine the ND pathing techniques with a proposed terrain recognition tool to produce a robust pathing system. The system should be able to determine cover IN RELATION TO the player, as well as lines of best approach/tactical significance. The tool will influence the Markov decision process relative to the NN being used for world approximations.
The tool should also identify those terrain features being most problematic to pathfinding algorithms (eg horseshoe shapes) and take steps to either simplify or ''pseudo-remove'' the offending obstacles. The combination of the ND methods and the design tool into a working system should be more than a challenge, and it remains to be seen what lines of investigation will be fully explored. In order to aid my broader understanding, I''m preparing a questionnaire for online participation. If I get permission, I''ll try and set it up here. If not, I''ll get back to you.
Thanks for the continued help
Dom Driver
Stroke me
Given that most terrains of interest are either complex, broad or both, how do you plan on efficiently creating an input space for your ANN while maintaining generalisation in your approach?
Cheers,
Timkin
Cheers,
Timkin
As a start point, I''ll try and get the system up and running, so the world will be just represented by a grid of states. These states will preliminarily be either impassable or passable. I know this is not as generic as I would like, but it is imperative to set up a test-bed for the theory in a simple way. The grid world would then be used to compare the ND pathing against other pathing methods and see how they fare in the same environment.
The terrain tool, which will use either an MLPs, Radial-Basis Function Nets or Support Vector Machines (or a combination), will scan overlapping grids of a certain size within the world grid. If a certain type of difficult terrain is recognised, steps will be taken to remedy the difficulties posed by traversing that section of terrain. Basically, it''s a pattern recognition problem, where multiple patterns are searched for. The exact format of this is as yet undecided, but this ''scanning'' for terrain features will occur as the map is created, not at run time. This will obviously mean that the remedies mentioned above will have to be complex enough to handle all possible situations - which may itself be impossible.
Don''t forget, I''m still researching, nothing is set yet.
I can''t create the generic AI I wanted (as mentioned elsewhere in this forum) becuase I simply don''t have the time. I need to start small, get the two-parts functioning, then expand to see if they can be made generic. If they can, I''ll continue my work. If they can''t, I''ll have created an interesting little toy for everyone to dissect and criticise!
The terrain tool, which will use either an MLPs, Radial-Basis Function Nets or Support Vector Machines (or a combination), will scan overlapping grids of a certain size within the world grid. If a certain type of difficult terrain is recognised, steps will be taken to remedy the difficulties posed by traversing that section of terrain. Basically, it''s a pattern recognition problem, where multiple patterns are searched for. The exact format of this is as yet undecided, but this ''scanning'' for terrain features will occur as the map is created, not at run time. This will obviously mean that the remedies mentioned above will have to be complex enough to handle all possible situations - which may itself be impossible.
Don''t forget, I''m still researching, nothing is set yet.
I can''t create the generic AI I wanted (as mentioned elsewhere in this forum) becuase I simply don''t have the time. I need to start small, get the two-parts functioning, then expand to see if they can be made generic. If they can, I''ll continue my work. If they can''t, I''ll have created an interesting little toy for everyone to dissect and criticise!
Stroke me
This topic is closed to new replies.
Advertisement
Popular Topics
Advertisement
Recommended Tutorials
Advertisement