ANN-based Pong.
Hello, I'm developing a Pong clone in which one player is controlled by the computer using an artificial neural network (trained with a genetic algorithm). The only thing I can't figure out is: OK, I have designed my NN, I know the inputs, I know the outputs, but how do I "test" each genome? First of all, my NN has 3 inputs (1: the ball's x value, 2: the ball's y value, 3: the player's y value). The 3 outputs determine whether the player will move up, stay put, or move down. Now, how should I test each genome?
* Try one genome at a time, calculate its fitness, and then restart the game to try the next one?
* Try all the genomes in sequence in _one_ game (calculating the fitnesses at the same time)?
Any thought or idea is welcome :-)
p.s.: I'm pretty sure that back-propagation is far better suited than a GA in this situation...
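A common way to wire this up is to decode each genome into a flat weight vector and pick the action with the strongest output. A minimal sketch (the single hidden layer, its size, and the 0 = up / 1 = stay / 2 = down encoding are illustrative assumptions, not fixed by the design above):

```python
import math

def forward(genome, inputs, n_in=3, n_hidden=4, n_out=3):
    """One-hidden-layer NN; `genome` is a flat list of
    n_in*n_hidden + n_hidden*n_out weights."""
    w1 = genome[:n_in * n_hidden]
    w2 = genome[n_in * n_hidden:]
    hidden = [math.tanh(sum(inputs[i] * w1[h * n_in + i] for i in range(n_in)))
              for h in range(n_hidden)]
    outputs = [sum(hidden[h] * w2[o * n_hidden + h] for h in range(n_hidden))
               for o in range(n_out)]
    return outputs.index(max(outputs))  # 0 = up, 1 = stay, 2 = down
```

On the two options: evaluating one genome per (restarted) game keeps the fitnesses comparable, since every genome starts from the same state; chaining all genomes through a single game makes each score depend on whatever state the previous genome left behind.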
I don't see how back propagation could be applied to this problem. I'd be interested in hearing how you would do that.
I would take a different approach to computing fitness. I would have each genome play a game against some number of other genomes chosen randomly from the population. If each NN plays 25 games, each against a randomly selected opponent, you could give it a fitness related to how many times it won. You could even work in the point difference at the end of the game to try to find those that won by a large margin. By playing against random opponents you're more likely to find a generally applicable controller rather than one that memorizes a sequence of locations.
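A sketch of that scheme (the `play_game` callback and the 0.01 margin weight are my assumptions; `play_game(a, b)` is taken to return a's score minus b's at the end of the game):

```python
import random

def tournament_fitness(population, play_game, n_games=25):
    """Each genome plays n_games against randomly chosen opponents.
    Fitness = number of wins plus a small bonus for the point margin."""
    fitness = {i: 0.0 for i in range(len(population))}
    for i, genome in enumerate(population):
        for _ in range(n_games):
            j = random.choice([k for k in range(len(population)) if k != i])
            margin = play_game(genome, population[j])
            fitness[i] += (1.0 if margin > 0 else 0.0) + 0.01 * margin
    return fitness
```

A dummy `play_game` such as `lambda a, b: a - b` is enough to smoke-test the loop before plugging in the real game.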
Also realize that this isn't a very good method of AI for this type of game. But, that being said, if you're doing this to learn about evolved controllers in games, it is an interesting experiment. I would be particularly interested in seeing your results.
-Kirk
Why have you chosen to use NNs here? It seems like value iteration would be a much more effective learning method.
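For the curious, value iteration itself is only a few lines once the game is discretized into states; the `transition`/`reward` model below is a placeholder assumption, since building one for Pong (ball cell × paddle cell) is the real work:

```python
def value_iteration(n_states, actions, transition, reward, gamma=0.9, iters=100):
    """transition(s, a) -> next state; reward(s, a) -> immediate reward.
    Repeatedly applies the Bellman optimality backup."""
    V = [0.0] * n_states
    for _ in range(iters):
        V = [max(reward(s, a) + gamma * V[transition(s, a)] for a in actions)
             for s in range(n_states)]
    return V
```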
Thanks for your quick answers.
@Sneftel: well, just for educational purposes. I wanted a game to apply some kind of NN to; I chose Pong :P. I'll check out the "value iteration" thing you mentioned...
@kirld: hm, well, I was thinking that I could train the computer player by letting him play random games and providing (by calculating) the final or best position from which to intercept the ball. Just a thought :-)
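That training target can be computed geometrically by "unfolding" the wall bounces: a path that reflects off the top and bottom walls is a straight line folded back into the court. A sketch (assuming walls at y = 0 and y = height, and a ball moving toward the paddle):

```python
def intercept_y(ball_x, ball_y, vx, vy, paddle_x, height):
    """Predict the y at which the ball crosses the paddle's x,
    accounting for reflections off the top and bottom walls."""
    if vx == 0:
        return ball_y  # ball never reaches the paddle; fall back to current y
    t = (paddle_x - ball_x) / vx   # time until the ball reaches paddle_x
    y = ball_y + vy * t            # y of the unfolded (straight) path
    period = 2 * height            # one full up-and-back reflection cycle
    y = y % period
    if y > height:                 # fold the mirrored half back into the court
        y = period - y
    return y
```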
Now, your idea about having genomes compete against each other is pretty cool. I'll give it a shot as an alternative version or something.
Just as you said, it is for experimental purposes :P heh
If the results are satisfactory I'll post them.
Thanks guys
Quote:
Original post by makism
@Sneftel: well, just for educational purposes.
In terms of "educational purposes", value iteration is far, far more important and useful for game AI than NNs. So is minimax. So are SVMs, for that matter. I'm gonna go waaaaay out on a limb and guess that you chose NNs because they sounded "brainy" and "awesome". That's fine, but you might also want to learn some useful stuff.
Quote:
Original post by Sneftel
Quote:
Original post by makism
@Sneftel: well, just for educational purposes.
In terms of "educational purposes", value iteration is far, far more important and useful for game AI than NNs. So is minimax. So are SVMs, for that matter. I'm gonna go waaaaay out on a limb and guess that you chose NNs because they sounded "brainy" and "awesome". That's fine, but you might also want to learn some useful stuff.
Is this some kind of joke? I really don't care about game dev (nor game AI); as part of my graduation I have to write a dissertation (also called a thesis), so I _chose_ neural networks... not game development...
so... this is for educational purposes.
geez...
Dude, no need to get hot under the collar. I'm merely pointing out that as machine learning techniques go, NNs are pretty weak, and not nearly as useful as more common techniques. People are drawn to them because they sound sexy, while ignoring other techniques that are more interesting and effective. That's the case whether you're doing game AI or machine learning in general.
Quote:
Original post by makism
... i was thinking that i could train the computer player by letting him play random games and providing (by calculating) the final or best position from where to intercept the ball. just a thought :-) ...
Yes, there are machine learning methods (including neural ones) which learn from a series of decisions without direct feedback on each individual decision: this is known as "reinforcement learning". The classic methods would be BOXES, by Michie and Chambers, and the Adaptive-Critic methods, by Sutton and Barto, but there are others. The blind optimization route is also possible, and has solved some complex problems, though my guess is that it is relatively inefficient.
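As a concrete instance of the Sutton/Barto line of methods, tabular Q-learning fits in a dozen lines; the `step` callback, the tabular state encoding, and starting every episode from state 0 are illustrative assumptions:

```python
import random

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration.
    step(s, a) -> (next_state, reward, done)."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if random.random() < eps:                      # explore
                a = random.randrange(n_actions)
            else:                                          # exploit
                a = max(range(n_actions), key=lambda a: Q[s][a])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```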
-Will Dwinnell
Data Mining in MATLAB
I too do not see how back-propagation could be applied here, unless you already have a procedural algorithm for the back-prop to train against (i.e., "this is what I would have done, so converge toward that").
A genetic algorithm where you let the game play for a minute to assess fitness might work.