This post is related to a comment Alvaro made in a thread about AI in games of perfect information.
I'm interested in implementing Monte Carlo Tree Search for a game that involves random outcomes. For some moves the outcome is fixed; for others there could be six or more different outcomes, depending on some unknown value. The probabilities of each outcome occurring can change as the game progresses -- in this way it is similar to Poker.
I understand MCTS in terms of growing the tree for moves with a fixed outcome, but I'm confused about nodes that have a random outcome. Do I try to expand these nodes, or do I simply use the node as a starting point for a number of simulated games?
If they are expanded, how? Is each possible outcome explored as a separate node? Are the probability distributions stored, with the node essentially 'randomized' each time it is visited (which would play havoc with the statistics accumulated in the child nodes)? I can see the value in expanding them (to a degree), but it seems extremely risky given the nature of UCB -- it may attach priority to a very unlikely outcome.
Any help would be very much appreciated. Thanks!
Will
Monte Carlo Tree Search
You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node.
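Something like this, as a rough sketch of that selection rule (illustrative Python; the names and structure are made up for the example, not taken from any particular engine):

```python
import math
import random

# Sketch only: chance nodes are expanded like any other node, but during the
# tree descent we sample a child at random instead of applying UCB1.

class Node:
    def __init__(self, is_chance=False, prob=1.0):
        self.is_chance = is_chance  # True if the "move" at this node is decided by the dice
        self.prob = prob            # probability of reaching this child from a chance parent
        self.children = []
        self.visits = 0
        self.wins = 0.0

def select_child(node, exploration=1.4):
    if node.is_chance:
        # At a chance node, don't use UCB: sample an outcome by its likelihood.
        weights = [child.prob for child in node.children]
        return random.choices(node.children, weights=weights, k=1)[0]
    # At an ordinary decision node, use plain UCB1.
    def ucb1(child):
        if child.visits == 0:
            return float('inf')
        mean = child.wins / child.visits
        return mean + exploration * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children, key=ucb1)
```

The rest of the search (expansion, playout, backpropagation) is unchanged; the only difference is how a child is chosen while descending through a chance node.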
Does that clear up all your questions?
Put another way, those branches of the tree represent "what could possibly happen based on the relative likelihood that they will happen."
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play
"Reducing the world to mathematical equations!"
You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node. Does that clear up all your questions?
I think so. So to clarify: I expand until I hit a 'chance' node. When I hit a chance node, I roll the dice and expand as if that were the move. The next time I run into that node, I roll the dice again.
The way I have done it before, the board structure knows whose turn it is. I consider die rolls to be moves by a player called "Fortuna" (my wife is a classicist). The UCT tree expands a position where Fortuna is to move just as it does any other position. When I store the statistics for each move (wins and visits), the wins are stored from the point of view of the player to move. Since Fortuna doesn't care who wins, I always count the result of the playout as 0.5 from her point of view. The UCB rule will then always select a move among the ones that have been explored the least. I add a bit of randomness to the UCB score of each move so we don't always explore "1" first or have some other weird bias like that.
EDIT: Of course, in the non-UCT part of the playout, Fortuna picks random moves.
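A rough sketch of that bookkeeping (again illustrative Python; the code and field names are made up for the example):

```python
import math
import random

# Sketch only: die rolls are moves by an extra player, "Fortuna", and the
# playout result credited to Fortuna is always 0.5.

class Edge:
    def __init__(self, move):
        self.move = move
        self.visits = 0
        self.wins = 0.0   # stored from the point of view of the player to move

def backup(path, winner):
    # 'path' holds (player_to_move, edge) pairs visited while descending the tree.
    for player, edge in path:
        edge.visits += 1
        if player == "Fortuna":
            edge.wins += 0.5                     # Fortuna doesn't care who wins
        else:
            edge.wins += 1.0 if winner == player else 0.0

def ucb_score(edge, parent_visits, exploration=1.4):
    if edge.visits == 0:
        # Unvisited moves come first, in a random order rather than always "1" first.
        return 1e9 + random.uniform(0.0, 1.0)
    mean = edge.wins / edge.visits
    jitter = random.uniform(0.0, 1e-6)           # small randomness to avoid systematic bias
    return mean + exploration * math.sqrt(math.log(parent_visits) / edge.visits) + jitter
```

Because every explored roll ends up with a mean of 0.5, the exploration term dominates at Fortuna's positions and UCB keeps picking the least-visited outcomes, which is the behaviour described above.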