Advertisement

Alpha-Beta Pruning in JavaScript - Help!

Started by May 20, 2010 01:31 AM
4 comments, last by alvaro 14 years, 8 months ago
Hello all! I'm trying to use alpha-beta pruning for AI in a very simple &#106avascript game, but it's causing the computer player to make pretty bad choices. Rules: Player take turns choosing numbers between 1 and 6. Each number chosen is added to a running total. If a player brings the total up to 21, he/she earns 2pts. That's the ONLY way to earn points, so whoever hits 21 "wins". Problem: As the running total approaches 21, the computer equally favors every move that would bring the total to an even number, and equally opposes any move that would result in an odd-numbered total. (e.g. When the total is 13, it's content picking a '2', '4', or '6' for its turn, even though playing a '1' would be the best choice.) Any ideas what I'm doing wrong? Here's the core of my code, for anyone brave enough to take a look:


      // When copying arrays/objects in JavaScript, the new variable actually 
      // just stores a *reference* to the original object/array.
      // In order to create a separate array/object, recreate one using this 
      // function instead.
      //     Usage:  var object2 = object1.clone();
      Object.prototype.clone = function() {
        var newObj = (this instanceof Array) ? [] : {};
        for (i in this) {
          if (i == 'clone') continue;
          if (this && typeof this == "object") {
            newObj = this.clone();
          } else {
            newObj = this;
          }
        }
        return newObj;
      };

      // Game object
      function Game() {
        this.activePlayer = [1];   // [1]=player, -[1]=computer
        this.runningTotal = [0];   // Cumulative sum. Each move made adds to it.
        this.playerScore = [0];    //
        this.computerScore = [0];  // 
        this.movesMade = [0];      // A running count of the number of moves made
      }


// Global variable to track current game state
var currentGame = new Game();


      // Applies the given move to the given game state (node)
      // Awards points to a player if the move brings the runningTotal up to [21]
      function applyMove(node, move) {

        // Create a fresh copy of 'node' to ensure that its data won't
        // be impacted by actions made later in this function.
        var newNode = node.clone();  

        newNode.runningTotal += [1]*move;
        newNode.movesMade++;
        if (newNode.runningTotal==[21]) {
          if (newNode.activePlayer==[1]) newNode.playerScore += [2];
          if (newNode.activePlayer==-[1]) newNode.computerScore += [2];
        }
        newNode.activePlayer *= -[1];

        return newNode;
      }


      // Alpha-Beta pruning function
      function evaluate (node, alpha, beta, depth) {

        var validNextMoves = getValidNextMoves(node);
        var movesCount = validNextMoves.length;

        if (depth==[0] || movesCount==[0])     // If the node is a leaf,
          return getHeuristicValue(node);  // return the heuristic value of it.

        if (node.activePlayer==[1]) {      // If node is a maximizing node (human)
          for (i=[0]; i<movesCount; i++) { // For each child of node
            var move = validNextMoves;
            var child = applyMove(node, move);
            beta = Math.min(beta, evaluate(child, alpha, beta, depth-[1]));
            if (beta <= alpha) return alpha;
          }
          return beta;

        } else if (node.activePlayer==-[1]){ // If node is a minimizing node (CPU)
          for (i=[0]; i<movesCount; i++) {   // For each child of node
            var move = validNextMoves;
            var child = applyMove(node, move);
            alpha = Math.max(alpha, evaluate(child, alpha, beta, depth-[1]));
            if (beta <= alpha) return beta;
          }
          return alpha;
        }

      }


      // Checks values of all possible next moves for the current game
      // state (currentGame) and makes the move with the highest expected
      // heuristic value outcome.
      function getBestMove() {

        var validNextMoves = getValidNextMoves(currentGame);
        var movesCount = validNextMoves.length;
        if (movesCount==[0]) {
          alert("No valid moves found.");
          return;
        }
        var bestMove = validNextMoves[[0]];
        var bestValue = -[9999];

        for (var i=[0]; i<movesCount; i++) {

          var move = validNextMoves;
          var newNode = applyMove(currentGame, move);
          var alpha = -[999];
          var beta = [999];
          var depth = [10];
          var value = evaluate(newNode, alpha, beta, depth);

          if (value > bestValue) {
            bestValue = value;
            bestMove = move;
          }
        }

        return bestMove;
      }


      // Returns an int, based on the scores in the given game state (node).
      // The higher the int, the better the situation is for the human player. 
      function getHeuristicValue(node) {
        return node.playerScore - node.computerScore;
      }

I haven't gone through your code, but you may want to consider a different algorithm altogether. This is a variation of Nim, and building a program that always wins if a win is possible is very easy.

Let's make a table of current count and winning move (if there is one), starting with high numbers:

20 119 218 317 416 515 614 -13 112 211 310 4 9 5 8 6 7 - 6 1 5 2 4 3 3 4 2 5 1 6 0 -


Notice that deciding each entry in the table can be easily decided based on the entries above it.

So in this particular game, optimal play is very simple: If the count is a multiple of 7, you don't have a winning strategy, so play something random. If the count is not a multiple of 7, make it a multiple of 7.

Games in which the available moves don't depend on whose turn it is are called "impartial games", and there is a very well-developed theory to study them. They can often be analyzed by making a table like the one above. And if that fails, they can always be analyzed by using something called "nimbers".

If the point of your code was learning how to implement alpha-beta, ignore everything I just said (although it's good to know).
Advertisement
Thanks alvaro, but I'm actually planning to expand my code into a Cribbage game, so I'd really like to figure out the alpha-beta pruning route.

If it helps any, this seems to be how my AI values each move:

(new total of making the move) = (evaluate() function's value for this move's branch)
<=10 = 0
11 = -2
12 = 2
13 = -2
14 = 2
15 = -2
16 = 2
17 = -2
18 = 2
19 = -2
20 = 2
21 = -2
>=22 = 0

I'm puzzled by the alternating between value 2 and -2. It almost seems like I have a -1 multiplier in a recursively-called function.
I would start debugging by looking at the situation that is closes to the end of the game. You should be able to follow every step in the program and see if it's doing what you expected.

Alpha-beta is not applicable to games with hidden information (cards in a player's hand) or to games with more than two players, so there is little hope this is going to help you with cribbage.

Quote:
Alpha-beta is not applicable to games with hidden information (cards in a player's hand) or to games with more than two players, so there is little hope this is going to help you with cribbage.


Hmmm... I was hoping I could incorporate an estimated probability of each branch occurring into the mix (an opponents actual hand may be unknown, but there are still a finite number of cards he/she could have), but perhaps I'm misunderstanding how alpha-beta pruning is supposed to work. What AI method(s) do you suggest? For the record, I'm OK with the 2-player constraint.

As for my alpha-beta pruning code, I would still love to get it working, even if it's not applicable for my cribbage game. Perhaps it'd be easier to find a working &#106avascript alpha-beta implementation, and modify/simplify it as needed. Do you know where I could find one? I've seen a fair share of pseudocode as well as implementations in other programming languages, but I have yet to find anything beyond the standard evaluation function for &#106avascript.
Sorry, I know nearly nothing about &#106avascript.<br><br>I think the most promising path to implement engines for card games is some sort of Monte Carlo technique. I don't have any clear pointers to give you at this time, but there was <a href=http://www.mail-archive.com/computer-go@dvandva.org/msg00331.html>a discussion in the computer-go mailing list</a> recently about the game of Tichu, and you may find it interesting.

This topic is closed to new replies.

Advertisement