Advertisement

Evolving PacMan AI

Started by January 29, 2008 05:08 PM
37 comments, last by kirkd 16 years, 8 months ago
I've seen similar things happen. One problem that I've noticed is that given the random nature of the ghosts a particular network can receive a very high score simply due to the fact that on one run through the maze all the ghosts were in just the right place at the right time for PM to get a very high score. Most of the subsequent runs have much poorer scores, but due to the one good run, that network gets an excessively high score. This usually causes the population to converge very early to a less interesting neural net.

To combat this I put in the concept of Replicates. Each network is run multiple times and you have the option to take the average, median, or minimum score that particular network achieved over all its runs. I've found that if I set the replicates to 7 and take the minimum (you can adjust these in the parameters file), this usually helps take care of the problem of a fortuitously high score.

So far in the presence of ghosts the results are less than spectacular. I think the concepts are being learned, but having ghosts present presents a very difficult problem. I still question whether the inputs are appropriate. In the most recent link sent by owl the intention was to try and replicate what a human player would do in a certain situation. For every step in the game with a human player, they generated a series of integers describing the state of the game and how the human player moved. After collecting a large amount of this data, they used it in more of a data mining sense using a decision tree to determine how PM should move given the current game state. Very interesting idea, but again I was hoping for a situation with raw game state in giving PM control out.

As I mentioned before, my next experiments will be to install a vector based approach. A series of vectors centered on PM and providing the angle and distance to ghosts and power pellets will be used. I'm not certain of what to do with dots and walls just yet, but I may leave them in their current windowed version.

Finally, I'll try to update the SourceForge page again this week. I've made a couple of modifications to the code that speed it up and fix some additional, small bugs.

-Kirk

You also gotta check the related works link. Very specially this paper, that guy got some interesting outputs there.

Another data you could consider is the shortest path's distance between the pacman and the ghost, to give pacman the chance to try to maximize/minimize it accordingly.
[size="2"]I like the Walrus best.
Advertisement
I scanned through the thesis briefly and it is certainly a unique approach. It seems that they modeled the maze space as a series of turns and look for an assessment of what to do at each turn given the current ghosts states, etc. I also notice that they have only 1 ghost. I will run some tests with my code using only 1 ghost and see what my relative score is compared to their's.

EDIT: I also notice that they do not seem to have any power pellets. It appears that the ghost represents an additional constraint on PM's actions, but does not contribute to his scoring ability.

[Edited by - kirkd on February 12, 2008 1:51:21 PM]
Quote: Original post by kirkd
To combat this I put in the concept of Replicates. Each network is run multiple times and you have the option to take the average, median, or minimum score that particular network achieved over all its runs. I've found that if I set the replicates to 7 and take the minimum (you can adjust these in the parameters file), this usually helps take care of the problem of a fortuitously high score.


I was actually running with 2 or 4 (can't remember which) replications :) Must've been a serious fluke some time during the night.
I used to always run with 5 replicates and found that an excessively high score can occur 5 times in a row with some regularity. Needless to say I was quite surprised. It seems that 7 works reasonably well, albeit slowing down the process quite a lot.

I've also tried to make the ghosts more aggressive to try and cut down on accidental collisions with PM while they're blue:

I increased the rate at which they move toward/away from PM when hunting/blue. It used to be 20%, 40%, 60%, and 80%, but I've changed that to 50%, 60%, 70%, 80%.

I allow the ghosts to change directions immediately after PM eats a power pellet whereas before they could only change direction if they were at an intersection.

These seem to have helped somewhat. I'll try to get the latest code up on SourceForge this afternoon.

-Kirk
Updated code is up. Here's the detail:

v0.1.3 Updates
--------------

Windows now update during the evolution step thus preventing significant freezing of the application during one epoch.

The console window now tells you whether PacMan will run or not after the current epoch and has (almost) instant feedback from pressing 'B'. The title bar of the console window (PacMan maze) states "Active" or "Sleeping" depending on the current state of the application.

Small optimizations to speed up the simulation.

Ghosts now allowed to reevaluate their direction after PacMan eats PowerPellet, and ghosts made more aggressive. Hopefully this will discourage chance collisions forcing PacMan to be more active in pursuit.

When the console window was running a spontaneous crash would occur the seemed to stem from the Windows message queue. A DoEvents() function was implemented to allow window events to be processed while the Console was active. This seems to have solved the problem.
Advertisement
with a score of only 520 it plays a lot better now. I notice that sometimes the simulation timeouts, is that because pacman gets stuck? it stops when pacman reaches the highest fitnes value, why don't you let it run until he dies?

[Edited by - owl on February 12, 2008 4:26:20 PM]
[size="2"]I like the Walrus best.
Hmmmm....I'm not sure if I know what you mean. Currently it times out when he gets stuck in place or when he gets stuck in a non-scoring cycle. You can adjust the amount of time allowed for both of these in the parameters file:


MaximumUpdates 1000
MaximumStall 25
ScoreStall 25


MaximumUpdates is the maximum number of "steps" the simulation will take. I set this to 1000 after playing it manually a few times to determine how many updates would be required for me to clear the level. Usually I took about 600 (I think), so I set this to 1000. The primary reason for this limit is so that the overall simulation doesn't take quite so long to run while evaluating each net during the GA.

MaximumStall is a limit on the number of updates in which PM's position does not change. The number is arbitrary.

ScoreStall is a limit on the number of updates in which PM's score does not change. Again, this one is arbitrary.

These limits also define when the visualization itself stops. Possibly what you're seeing is a stall (stuck in a corner with no change in position) that is fairly short - 25 steps - so it appears to stop rather immediately. Try increasing the value for ScoreStall and MaximumStall to 100. Most likely he'll sit in the corner for a while before the visualization ends.

-Kirk

Hi I am very interested with this field of work as I at the moment am trying to prove that the problem of pacman can be solved with a rule based system prioritizing the pacmans goals and acting accordingly.

Here is the source, its a totally different approach but my tutor is eager to prove that this practice could compete to a suitable level;

http://devweb2007.cis.strath.ac.uk/~lmcmilla/pacmansol2%20(4).zip
Hi I am very interested with this field of work as I at the moment am trying to prove that the problem of pacman can be solved with a rule based system prioritizing the pacmans goals and acting accordingly.

Here is the source, its a totally different approach but my tutor is eager to prove that this practice could compete to a suitable level;

http://devweb2007.cis.strath.ac.uk/~lmcmilla/pacmansol2%20(4).zip

This topic is closed to new replies.

Advertisement