Rednaxela/NeuralPatternMatching

Well, when working on my first bot, Mervin, I once thought of the idea of doing PatternMatching using a neural network, in particular a variant of a [Hopfield Network], which is a type of neural network known to be rather good at recovering remembered patterns from incomplete or noisy data. My plan was to create a 'grid' of input nodes, with time along one axis and possible locations along the other. I would then train the network every tick (or every time a wave of mine hit?), and read it by feeding the enemy's recent actions into a read-only set of nodes and having it extrapolate the time nodes that follow. After making a partial implementation and testing its speed, however, I found that it would consume the robot's entire turn just to do a little processing.

More recently, when thinking of things I can work on in Gwynfor, I had the idea of trying this "Neural Pattern Matching" again, but reducing the number of needed input nodes by making it a form of SymbolicPatternMatcher and feeding the network more symbolic data, which should use fewer nodes and thus perform far faster. I'm thinking it would be effective to give it the following nodes for a given time instant:

I'm hoping those nodes would be sufficient, since the motion of many robots should be approximable by those alone. Also, these acceleration and turning values should be easy to feed into the same predictor I used for CircularTargeting, except now it would use non-fixed acceleration and angular velocity.
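
(To make the predictor part concrete, here is a rough sketch of a circular-style predictor driven by per-tick acceleration and turn-rate values; the names and structure are illustrative only, not code from Gwynfor:)

 public class IterativePredictor {
     /* Sketch: project the enemy forward using one predicted acceleration and
      * turn-rate value per tick instead of fixed circular-motion constants. */
     public static double[] predict(double x, double y, double heading, double velocity,
             double[] accelPerTick, double[] turnPerTick) {
         for (int t = 0; t < accelPerTick.length; t++) {
             heading += turnPerTick[t];                       // non-fixed angular velocity
             velocity += accelPerTick[t];                     // non-fixed acceleration
             velocity = Math.max(-8, Math.min(8, velocity));  // Robocode's speed limit
             x += Math.sin(heading) * velocity;               // heading 0 points north
             y += Math.cos(heading) * velocity;
         }
         return new double[] {x, y};
     }
 }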

Anyone think this idea sucks? Anyone think this is an awesome idea? :-) -- Rednaxela

Sounds like an excellent idea to me. I'm not 100% clued up on NeuralNetworks myself, but I can say the SymbolicPatternMatcher part will work well. What I'm wondering is how long a 'history' (i.e. how many previous scans) you should feed your net, and how you would get a Hopfield network to give you a future output that you don't know yet (from what I understand of Hopfield nets, they give the stored pattern closest to the input). WRT the predictor, I did the exact same thing in Waylander, using a circular-targeting algorithm and just modifying the data each iteration, so the idea definitely works well enough =) -- Skilgannon

I'm also wondering how long a 'history' would be necessary. I'm currently thinking along the lines of 100 ticks, as I think that shouldn't be too bad to calculate, but I'm not sure whether that will be long enough. Any comments on how long a history PatternMatchers tend to need to be useful? Yes, you are correct that they return the 'remembered' pattern closest to the input. I intend to adapt it for predicting future values by inputting the whole history, 'scrolling' it across the input nodes while learning (or maybe only when waves hit the enemy, I'm not sure yet), and when predicting, putting recent history into the "early" nodes of the network and having it try to "remember" the "later" nodes based upon the earlier ones. The Hopfield network's algorithm is a simple repetitive one that changes all nodes iteratively to bring the state closer and closer to something that was remembered, so I will adapt it so that the "already known" nodes keep a fixed value as provided by the scans, and the iterative process only changes the 'unknown' nodes (the ones representing future values). Additionally, I hope I can reduce the number of iterations required by using previous 'guesses' about the still-unknown points in time as 'default' values for the unknown nodes. This effectively allows the pattern matcher to keep a continuous 'thought process' when predicting, instead of starting from a blank page each time. One aspect of a Hopfield Network is that when it is filled with more "memories" than it can store with 100% accuracy, they begin to blur together, but I'm hoping that they blur in ways "desirable" for this (i.e. give likely rough movement paths) rather than undesirable ones (i.e. absolute nonsense). Hopefully my blabber didn't seem entirely nonsensical =) -- Rednaxela
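
(A rough illustration of the 'clamp the known nodes, relax only the future ones' idea described above; this is just a sketch of a method that could sit in a Hopfield network class like the one posted further down this page, not actual bot code:)

    /* Sketch: recall with the first knownCount nodes clamped to observed history,
     * and only the remaining "future" nodes updated each iteration. Whatever the
     * future nodes already hold (e.g. a previous guess) is used as their start value. */
    public void recall(float[] observed, int knownCount, int iterations) {
        System.arraycopy(observed, 0, values, 0, knownCount); // fix the known history nodes
        for (int iter = 0; iter < iterations; iter++) {
            System.arraycopy(values, 0, lastvalues, 0, size);
            for (int x = knownCount; x < size; x++) {          // skip the clamped nodes
                float sum = 0;
                for (int y = 0; y < size; y++)
                    if (y != x) sum += lastvalues[y] * weights[x][y];
                values[x] = 1 / (1 + (float) Math.exp(-12 * sum + 6));
            }
        }
    }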

Yep, 100 ticks should be plenty. In fact, 35 ticks would probably be sufficient for the majority of rumble participants; Waylander rarely gets matches longer than 40. Another thing to consider is that robots change their patterns depending on their distance from you, so maybe 4 values representing the distance would also be useful? I understand the rest of your comment, but it's a 'learning' type of understanding =). As I said before, my knowledge of Neural Networks is limited =). -- Skilgannon

Well, while behavior is distance dependent, I'm not sure simply adding values for distance would work too well with the way I'm intending to use the Hopfield network. With a Hopfield network, the output nodes are the same as the input nodes, so it would have redundant outputs used for nothing but feeding back into the input. Perhaps I could add some distance nodes and simply ignore their output, using them for input only, but I'm not sure whether that would give better results than just having a separate network for each distance 'bin'. It probably wouldn't hurt to try it in the same network though, and it wouldn't have the same learning speed decrease as having a separate network per distance. Is distance segmenting really that useful though? In my guessfactor targeting I haven't found it to do very much until getting to the bin that tends to be associated with collisions. -- Rednaxela

Oh, just checking, you do know that acceleration is 1, and deceleration is 2, right? -- Skilgannon

Yes (I accounted for that in my predictor for my wavesurfing). This brings up an issue though: I either need to make nodes for 'acceleration' of -2, -1, 0, 1, and 2 (the robot could be moving either forwards or backwards), or I need to make another "angle" node that represents a 180 degree flip. I'm not sure which is the better way to go, or if there's a better way. -- Rednaxela

I personally think the "angle" node is a better way, because it allows you to use only -2,0,1 for acceleration, and doubles your learning speed. It should also make you more effective against WaveSurfers, who look at the data exactly this way, and I think that if your stats start to 'blur' together, they will 'blur' better if you implement it this way. -- Skilgannon

Well, a subtler detail, but I actually think a better way to describe it than an "angle" node would be a "direction" node representing the current direction it is moving. Hmm... Gwynfor is now getting to the point where it's about to start skipping turns, since I added the WaveSurfing... Either I'll have to remove things, optimize things, or start a new bot if I want to try this pattern matching idea out, as it won't be computationally cheap itself. I suppose I could make a test variant to test this gun with other guns removed. -- Rednaxela

On second thought, I think it would be better to go with the -2,-1,0,1,2 acceleration scheme, but you multiply all accelerations in the history by the sign of the current velocity before you feed them to the network. This means it would learn clockwise patterns as fast as anticlockwise patterns. Another thing I'm wondering, due to my limited NN experience, is whether it is possible to weight some inputs so that they are more important than others. This is because the most recent accel/headingchange entry is much more relevant than the one 35 ticks ago. -- Skilgannon

Wait, wouldn't multiplying by the sign of the current velocity be exactly what one does to only get -2, 0, and 1? And yes, it should be possible to weight newer inputs as more important than older inputs, but keep in mind that the gradient shouldn't be so steep that the network is forced to spend a lot of effort un-learning it. Note, I'm pretty sure it would learn on its own that more recent values are more relevant (after all, a Hopfield network is largely about learning simple correlations between combinations (and thus patterns) of values in nodes), but it would probably be possible to have a default gradient that saves it some of the trouble. -- Rednaxela

In a sense, yes, but that is only for the current scan. They may have changed direction 4 ticks ago, so the deceleration that preceded the direction change would be considered positive 2, and the current acceleration as the robot speeds back up is considered positive 1, which makes sense since that 'acceleration' is in the same direction. Say before that they had accelerated; that would be considered negative, because the direction of that acceleration is opposite to the current direction. Thus, you can get both -2, 0, 1, as well as -1, 0, 2, which gives you the full range. -- Skilgannon
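
(Spelling out that encoding as code, purely as an illustrative sketch with hypothetical names, assuming a plain array of the enemy's signed velocities with the most recent value last:)

 static float[] encodeAccel(double[] velocities) {
     double dir = Math.signum(velocities[velocities.length - 1]); // sign of current velocity
     if (dir == 0) dir = 1;                                       // arbitrary choice when stopped
     float[] encoded = new float[velocities.length - 1];
     for (int t = 1; t < velocities.length; t++) {
         double rawAccel = velocities[t] - velocities[t - 1];     // signed change in velocity
         encoded[t - 1] = (float) (rawAccel * dir);               // relative to current direction
     }
     return encoded;
 }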

Well... for that to work at all, wouldn't I need to invert the acceleration depending on a value in the middle of the history in order to keep it in reference to the same node as during real usage? I also think doing it that way would really throw things off, because all of the acceleration values would invert every time the current direction changes. I don't really see why all acceleration values should be multiplied by the sign of the current velocity, as opposed to the sign of the velocity at the instant the acceleration refers to. -- Rednaxela

The reason I say this is so that your algorithm works just as well going forwards as backwards, and also so that it can differentiate between "same as the current direction" and "different from the current direction". This makes it as effective as a guessfactor gun as far as learning speed goes. -- Skilgannon

Hmm, I think I understand what you mean. In any case, it would be trivially easy to reconfigure how the nodes of the network are set up, so it shouldn't be hard to experiment with these types of things too. Hopefully I can manage to optimize Gwynfor's CPU usage enough that I'd be able to fit this gun in. I suppose I'll worry about that after I get Gwynfor's WaveSurfing working well enough. -- Rednaxela

Well, today I thought about this idea more, and I think I now have a good idea of how to make this also segment by different factors effectively. First, one thought I had is that perhaps the angle nodes, instead of representing clockwise/counterclockwise, should represent "turning towards/away from me", and then instead of only having magnitudes of 0 or the maximum turn, they would have magnitudes of max turn, perpendicular, and 0. About segmenting, I now think it should work fine to make things like distance input nodes as well, and just not have distance nodes in the 'portion' of the net that predicts forward. With the way the Hopfield network works, it basically just learns which nodes being turned on tend to be associated with other nodes being turned on, and finds the best match, and I'm now feeling more confident than before that segmentation can be handled internally in the network: it should be fine at learning how various segments affect the patterns, even if its "memory" overflows the maximum at which it has perfect recall. I suppose the limit on the amount it can perfect-recall is somewhat of an advantage compared to normal pattern matching, in that its speed never decreases with an increased number of patterns, and as the number of recorded patterns exceeds the maximum it can perfect-recall, it should take on a statistical-like element. In some ways, the end result will be something of a hybrid of log-based and statistical-based workings, which I honestly think should be able to achieve the best of both worlds. Of course, isn't that what everyone says of their targeting ideas? :) Anyways, I'm feeling quite hopeful about this idea now, and will probably implement it in the rewritten Gwynfor before I try any GuessFactor stuff again. My only fear is that this pattern matching neural network may be too computationally intensive. I suppose I should research what I can about optimizing Java as much as I can. -- Rednaxela
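
(One possible reading of the "distance as input-only nodes" part, as a sketch with an entirely hypothetical node layout: the input vector is [history ticks | future ticks | distance bins], and during recall the distance bins would be clamped along with the history rather than predicted:)

 static float[] buildInput(float[] historyNodes, int futureNodeCount,
         double distance, int distanceBins, double maxDistance) {
     float[] input = new float[historyNodes.length + futureNodeCount + distanceBins];
     System.arraycopy(historyNodes, 0, input, 0, historyNodes.length);
     // Future nodes are left at 0 (or at a previous guess, if carrying predictions over).
     int bin = (int) Math.min(distanceBins - 1, distance / maxDistance * distanceBins);
     input[historyNodes.length + futureNodeCount + bin] = 1; // one-hot distance segment
     return input;
 }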

I've mentioned this before, but an easy way to optimize your stuff is to avoid unnecessary calls to generalized methods. For example, write your own sqr(x){return x*x;} function instead of using Math.pow(x,2), and always use Math.sqrt(x) instead of Math.pow(x,0.5). Don't pass more parameters than you need to for your own internal methods, and try to inline instead of doing recursion. Avoid calls to Math.sin, cos, tan, asin, acos, atan, atan2 as much as you can; they are very slow. Consider the order you multiply variables in: for example, if you have 3 doubles, x, y, and z, and z is frequently zero, do any multiplications as z*x*y, so that the second multiplication goes quicker. If you need to use the same distance more than once, cache it; doing a distance calculation involves a sqrt(), which is slow. Cache calls to sin, cos, tan if you're going to use them more than once. These are all things that I use in DrussGT =) -- Skilgannon
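
(A few of those tips in code form, as a sketch only; the names here are made up:)

 public class FastMathTips {
     // Cheap square instead of Math.pow(x, 2)
     static double sqr(double x) { return x * x; }

     static double[] project(double myX, double myY, double enemyX, double enemyY,
             double heading) {
         // Cache the distance: the sqrt() is relatively slow, so compute it only once.
         double distance = Math.sqrt(sqr(enemyX - myX) + sqr(enemyY - myY));
         // Cache trig results if they're used more than once.
         double sinH = Math.sin(heading), cosH = Math.cos(heading);
         return new double[] { myX + sinH * distance, myY + cosH * distance };
     }
 }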

Actually, one trick I did in what I've started so far in the rewrite is to make my RelativePoint class constructible from either x/y coords or a direction and a magnitude, and have it lazily calculate the other form of the vector as requested by other objects and cache it (as the sin/cos etc. are indeed a little more expensive). As far as this NeuralPatternMatching goes though, I made a test for its neural net way back when, and at a "history size" of 100, with 6 nodes per tick, it would take about half my robot's turn to process, which I'd call far too much, particularly considering I'd like more nodes per tick than that. The issue is, I'm not sure where I can optimize, because I already have it in a tight array, doing nothing but a couple of elementary operations on a large number of floats. Well, here's the "base" of the Hopfield algorithm that I'm having trouble optimizing:

 public class HopfieldNetwork {
    private final float[] lastvalues;
    private final float[] values;
    private final float[][] weights;
    private final int size;
    private static final float learnrate = 0.2f;
    
    public HopfieldNetwork(int size) {
        this.size = size;
        values = new float[size];
        lastvalues = new float[size];
        weights = new float[size][size];
    }
    
    public void iterate() {
        /* Copy old values */
        System.arraycopy(values, 0, lastvalues, 0, values.length);
        
        /* Run an iteration */
        for (int x = 0; x < size; x++) {
            float valuesX = 0;
            for (int y = x; y-- > 0;) 
                    valuesX += lastvalues[y] * weights[x][y];
            for (int y = size; --y > x;) 
                    valuesX += lastvalues[y] * weights[x][y];

            /* Squash the weighted sum through a steep sigmoid with its midpoint at 0.5 */
            values[x] = (1/(1+(float)Math.exp(-12*valuesX+6)));
        }
    }
    
    public void train(float[] pattern) {
        if (pattern.length != size)
            throw new IndexOutOfBoundsException();
        
        /* Hebbian-style update: strengthen the weight between every pair of nodes
           in proportion to the product of their values in this pattern */
        for (int x = size; x-- > 0;) {
            float iterationWeightScale = pattern[x]*learnrate;
            for (int y = size; --y > x;) {
                float dweight = iterationWeightScale*pattern[y];
                weights[x][y] += dweight;
                weights[y][x] += dweight;
            }
        }
    }
    
    public float[] getValues() {
        return values;
    }
 }
About the only thing I can think of is to use a single-dimension array and manually calculate the index into the array of weights (since Java makes multi-dimensional arrays as arrays of array objects), and I'm not sure that would be enough of an improvement to be worth it. -- Rednaxela
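
(For reference, the flattened layout would look something like the sketch below, sitting alongside the iterate() in the class above; it's only an illustration of the idea, and as noted it may or may not be worth it:)

    /* Sketch: weights stored as one flat float[size * size] array, indexed as x * size + y */
    public void iterateFlat(float[] flatWeights) {
        System.arraycopy(values, 0, lastvalues, 0, size);
        for (int x = 0; x < size; x++) {
            float valuesX = 0;
            int row = x * size;                  // manual index calculation, one row per node
            for (int y = 0; y < x; y++)
                valuesX += lastvalues[y] * flatWeights[row + y];
            for (int y = x + 1; y < size; y++)
                valuesX += lastvalues[y] * flatWeights[row + y];
            values[x] = 1 / (1 + (float) Math.exp(-12 * valuesX + 6));
        }
    }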

I made a couple changes...it should be a bit faster, I think. What is a typical value for size? And was there any reason dweight was final? -- Skilgannon

Awesome, that optimization in iterate() made it nearly twice as fast; that's a clever one. As far as train() goes, I had already changed it to a float here and forgot to update the code paste, but that did give a slight performance increase. Well, dweight was final because it was only being assigned once. Realistically it makes no difference in this case, but I've made a habit of declaring things that are not reassigned as final, which in some cases can act as a safety check, and in other cases can theoretically allow the compiler/JVM to make slight optimizations from knowing that not even crazy reflection could modify it. In this case I tested and there's no effect on performance either way, but I'm trying to keep the habit of declaring things final when I can anyways. Anyways, size is equal to the number of nodes I want in the network. If I use 25 ticks of history, predict 25 ticks ahead, and use 6 nodes per tick, the value of size is 300, but even that takes about a third of the time available in my robot's turn (and note that's not counting the time for the "circular predictor" code to run off of the neural network), and the time taken increases very rapidly as size increases. Right now I'm researching sparsely connected Hopfield networks (only a limited number of connections, instead of fully connected), which may be able to increase performance since the number of weights would be far, far smaller. -- Rednaxela
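
(For the sparse idea, one common structure is an explicit neighbour list per node, so the inner loop only touches the connections that exist; the sketch below is just an assumption about how it might be wired, not tested code:)

 public class SparseHopfield {
     final float[] values, lastvalues;
     final int[][] neighbours;   // neighbours[x] = the node indices x is connected to
     final float[][] weights;    // weights[x][i] pairs with neighbours[x][i]
     final int size;

     public SparseHopfield(int[][] neighbours) {
         this.neighbours = neighbours;
         this.size = neighbours.length;
         values = new float[size];
         lastvalues = new float[size];
         weights = new float[size][];
         for (int x = 0; x < size; x++)
             weights[x] = new float[neighbours[x].length];
     }

     public void iterate() {
         System.arraycopy(values, 0, lastvalues, 0, size);
         for (int x = 0; x < size; x++) {
             float sum = 0;
             int[] nb = neighbours[x];
             float[] w = weights[x];
             for (int i = 0; i < nb.length; i++)  // only the connected weights, not all of them
                 sum += lastvalues[nb[i]] * w[i];
             values[x] = 1 / (1 + (float) Math.exp(-12 * sum + 6));
         }
     }
 }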

I changed the direction of the loops in train(); I'm not sure if that helped (maybe the external one did), but what will definitely help is not repeating 1) the access of pattern[x] and 2) its multiplication by learnrate 299 unnecessary times =). I've also tried caching the array in iterate(); I'm not sure if that will help enough to pay off the variable assignment etc., but it should remove one extra layer of abstraction. (edit) And I've cached the one array in train(), as well as values[x] in iterate(). -- Skilgannon

Well, the optimization in iterate() in your first edit has no measurable impact that I can see, but the one in train() does help by approximately 10% based on a test I just did. Your second edit had no impact on train(), and slowed iterate() by something like 5%. Better than nothing, but I think I'll still need to do more research into sparsely connected networks. -- Rednaxela

I'm surprised; I thought replacing 300 calls to values[x] with 300 calls to valuesX plus 1 float assignment would be much faster. Hang on... reading the code a few lines above, I've made one more change... this should help. -- Skilgannon

My guess is that the JVM does have a slight idea about how to do some array optimizations itself. That new change may have a slight impact but not a large enough one for me to measure. -- Rednaxela

Ok, a final optimization was to cast Math.exp() to a float before doing any more calculations with it. And I've reverted to what I think is the fastest combination. -- Skilgannon

Ahh yeah, that seems to work pretty well. I'll probably still need to investigate ways to modify the algorithm, such as making it partially connected instead of fully connected, but this help is greatly appreciated. The discussion about the CPU constant in Robocode/News is also encouraging that this may not be too CPU intensive after all. Thanks -- Rednaxela

Update: Well, with the consensus seeming to lean towards 1.4.x style CPU constants, it seems I should have enough available CPU to not need to look into partially connected networks so much. Thanks for your help in optimizing, Skilgannon. It should still be helpful, because after all, the more optimized the code is, the larger I can make the net, and thus the smarter it can be. Looks like this will become a reality as soon as I have the basic framework for Gwynfor V2 complete (it's taking a while to make the new framework, but hopefully it will speed up development of later components). -- Rednaxela

Another thought, so I don't forget it: it wouldn't add too much overhead to add "guessfactor" outputs to the neural net as well. Then it would be a single neural net powering two targeting methods, a pattern matching method and a GuessFactor method. The GuessFactor method would interestingly gain inherent segmentation on the movement pattern, and the PatternMatcher would gain inherent segmentation on the estimated likelihood of different guessfactors. Which method is actually used could be decided by a simple MultipleChoice setup, but the data/"thought" sharing between the two I believe would benefit both prediction modes. That's if things work as I hope in theory, anyways. I'll certainly have to try this out... -- Rednaxela
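
(A tiny sketch of how those extra outputs might be read back after recall; the node layout here is purely hypothetical:)

 // Assuming gfBins extra "GuessFactor" nodes appended to the net starting at index gfStart,
 // pick the most strongly activated bin after recall.
 static int bestGuessFactorBin(float[] values, int gfStart, int gfBins) {
     int best = gfStart;
     for (int i = gfStart; i < gfStart + gfBins; i++)
         if (values[i] > values[best]) best = i;
     return best - gfStart;  // index of the most activated GuessFactor bin
 }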

