Robo Home | GeneticProgramming | Changes | Preferences | AllPages

I explored four different conditions for training my robots. They are presented here in order of difficulty, beginning with the easiest condition.

One Adversary, Fixed Starting Position

The simplest possible experiment condition was training against a single adversary where all battles begin from a fixed starting position. Under this condition, I was able to reliably evolve robots that could beat each of the bots that ship with Robocode, as well as SquigBot. In some cases, only a few generations were required. Using a population of 500 individuals, I evolved a robot that could beat Walls in only 13 generations.

There are at two reasons that this condition was so favorable for my approach. First, evaluation was very fast, since only one battle was required to evaluate an individual. In practical terms, that meant that even with a population of 500 individuals, an entire run could be completed in a few hours. More importantly, however, this condition allowed robots to develop very brittle strategies that only worked under the precise training conditions. In general, these robots were unable to beat adversaries other than one that they trained against. The ability of these robots to generalize beyond their initial starting positions was mixed. Some runs produced robots that could win fairly reliably even at novel starting positions, but other runs produced extremely narrow solutions that only won at a single starting point.

One Adversary, Multiple Starting Positions

Combat outcomes are very sensitive to the initial starting position. Even an inferior robot can win easily if it starts the battle with its radar and gun fixed on the back of an opponent who is trapped in a corner. Thus, to produce a fair evaluation at multiple starting positions, a reasonable sample size is necessary. My sample sizes ranged from 10 to 25 starting positions; this had the obvious effect of increasing the training time by a large factor. To fairly evaluate all individuals in the same generation, I used the same set of random starting position for each individual within a generation. The starting positions were updated every generation to prevent overfitting.

I chose several adversaries from the ``starter'' set, and in all cases I was able to evolve a robot that could win more than half the time. For example, it took 60 generation to develop a robot capable of regularly beating SpinBot 80% of the time. Against SquigBot, another 60 generation run produced a robot that could win 50% of the time, but it appeared to plateau at this result. In most cases, when the evolved robots won, the margin was small, whereas when the hand-coded robots won, the margin was very large. In part this was due to the fact that most of these experiments were conducted when I was using a raw fitness function that actually encouraged this outcome by offering a declining rate of return for high scores. Changing the fitness function to more greatly reward the margin of victory improved the situation somewhat. However, to a large extent this problem was due to the fact most of my robots never even attempted to shoot their adversary. Instead, they won by dodging all fire until the opponent ran out of energy.

Multiple Adversaries, Single Starting Position

The robots evolved in the previous condition typically learned to move in a pattern that evaded their adversary's firing pattern. Predictably, these robots had no success against other adversaries. In this condition, I attempted to evolve robots that learned general programs that could defeat multiple adversaries.

After 80 generations, a robot evolved that could beat four out of five of the initial bots that ship with Robocode. The only bot that it couldn't beat was Tracker. My robot had evolved a complex movement pattern that avoided the fire of the other four adversaries, who would shoot from a distance. But the tracker inevitably chased my robot down and killed it. In another run, I trained my robot against 10 adversaries, including SquigBot. In 51 generations, a robot evolved that could beat four of the ten adversaries, through a combination of dodging and ramming.

Multiple Adversaries, Multiple Starting Positions

This condition was designed to produce the most general solutions: Robots that could beat multiple adversaries from a variety of different starting positions. Unfortunately, this condition was extremely time-consuming. I recently began an experiment with two adveraries, twenty random starting positions, and a population of 200. Under these conditions, each generation took roughly 130 minutes. After 14 generations, very little progress has been made. In the previous two conditions, 60 to 80 generations were required to produce winning robots. Since this condition is more general than either of the previous two, it would seem that 60 generations -- 130 hours -- would be soonest that any success could be expected.
This is so fascinating! 130 hours. I think that's how long it took me to hand code a bot that could beat all SampleBots. =) There's this thing with the SampleBots that they all shoot with HeadOnTargeting. Maybe that's why you reach good results in so few generations (at least few from what I would have guessed). Have you given any thought to using GeneticProgramming for very specific tasks? Like maybe for correcting LinearTargeting or some such. Also I think you could have some success with using GA to create a good randomized flattening movement. Since I'm not a schoolar I think it might be a too daunting task for me to try this out, but I'm tempted to check what I could do with that TABLEREX language... -- PEZ

Any chance of you publishing the code for the winning bot? I would be interested to see what kind of code can come out of a system like this. -- Tango

I would assume the code looks pretty similar for all these bots, whether evolved or not. It's just the genome string fed to the TABLEREX interpreter that changes. -- PEZ

That's right -- the java is identical, only the TABLEREX differs. I looked closely at some of the TABLREX and it was a little disappointing. A lot of lines that never connected to the "actuators", and then stuff like "turn left (my energy)"... -- Jacob

Robo Home | GeneticProgramming | Changes | Preferences | AllPages
Edit text of this page | View other revisions
Last edited June 11, 2003 19:37 EST by Paul Evans (diff)