Lukious/Rewrite - Robo Wiki -= Collecting Robocode Knowledge =-

I'm re-deriving Lukious from post-rewrite Dookious. Simonton demanded I post some TC results by today, so here are the latest. I'll clean this up later.

TargetingChallenge2K7/Results (500 rounds):

Name Gun CC RMX SHA WS WOE Surf DM FT GG RMC WLO No Surf Total Comment

Lukious 1.19.10 DC/GF 68.03 74.49 64.95 69.13 69.21 69.16 87.11 77.76 82.63 77.79 81.97 81.45 75.31 3 seasons

Lukious 1.19.13 DC/GF 72.94 76.93 69.59 76.67 72.84 73.79 87.99 79.71 84.85 78.04 80.63 82.24 78.02 3 seasons

Lukious 1.19.15 DC/GF 69.82 76.40 71.37 75.43 69.87 72.58 87.66 78.94 83.03 76.44 79.56 81.13 76.85 3 seasons

Lukious 1.19.16 DC/GF 71.21 76.80 70.67 76.96 71.50 73.43 88.06 78.39 83.55 77.13 81.40 81.71 77.57 3 seasons

Lukious 1.19.20 DC/GF 70.95 78.45 72.45 77.76 71.60 74.23 86.83 78.06 85.96 77.66 79.59 81.62 77.93 3 seasons

Lukious 1.19.24 DC/GF 87.79 77.84 84.14 77.65 81.27 81.74 3 seasons

Lukious 1.19.32 DC/GF 88.25 78.40 84.78 77.47 80.30 81.84 8 seasons

TargetingChallenge2K7/ResultsFastLearning:

Name Gun CC RMX SHA WS WOE Surf DM FT GG RMC WLO No Surf Total Commentsteady

Lukious 1.19.10 DC/GF 67.87 75.65 61.07 71.89 68.87 69.07 87.47 78.77 80.88 78.29 81.64 81.41 75.24 40 seasons

Lukious 1.19.13 DC/GF 70.65 77.40 64.45 76.03 71.93 72.09 86.77 80.14 81.91 78.17 81.73 81.74 76.92 40 seasons

Lukious 1.19.20 DC/GF 69.49 76.99 68.53 75.58 70.63 72.24 86.63 78.52 82.71 77.86 79.41 81.03 76.63 40 seasons

Lukious 1.19.24 DC/GF 68.39 76.93 67.22 74.21 69.14 71.18 86.23 76.21 82.17 75.34 79.92 79.25 75.58 50 seasons

TargetingChallenge2K6/Results (500 rounds):

Name Gun BFly CC Chk Cig Cya DM FM Grb RMB Tig Total Comment

Lukious 1.19.10 DC/GF 99.76 67.58 94.57 87.73 77.83 93.31 89.75 87.44 90.99 83.54 87.25 4 seasons

Version Notes:

1.19.10: 22,000 scans, cluster size 50.
1.19.13: 40,000 scans, cluster size 50. Switched the abs-ticks-from-firetime to use a "Triweight" instead of "Triangle" distribution. (Basically, more heavily weighting scans closer to fire time. See [Kernel Density Estimators].) Misc tweaks to distance function.
1.19.15: Entropy-based dynamic weighting on DynamicClustering attributes. Keeps a singly segmented VisitCountStats style buffer for each attribute, solely for calculating entropy. Each attribute has 3 segments, evenly distributed across the domain. Attribute with max information gain is weighted 5, with others scaled linearly such that 0 info gain would be weighted 1. All 13 attributes enabled: lateral velocity, advancing velocity, distance, wall distance, wall distance reverse, accel, distance last 8 ticks, distance last 15 ticks, distance last 25 ticks, time since velocity change, time since zero velocity, time since max velocity, abs ticks from aiming time (2 offset from fire time).
1.19.16: Uses "good" segments instead of evenly distributed ones; not all attributes are split into the same number of segments. Max weight is 7 instead of 5.
1.19.20: Added two more attributes: turn rate and distance last 60 ticks. Misc tweaks to existing attributes. Dynamic weights now exactly (still linearly) proportional to the information gain - attribute with 0 info gain would be weighted 0. Back to the Triangle distribution for the fire time element of KernelDensity (i.e., increasing weights of non-firing waves) and one small tweak to the KernelDensity method.
1.19.24: Dropped 5 attributes: distance last 15 ticks, distance last 25 ticks, time since velocity changed, time since max velocity, and turn rate. Forgot I'd left this in there - using Cosinus distribution instead of Gaussian for the KernelDensity. (Not sure this matters much at all...) Dimension weights used by the gun are a running total of weights that are calculated once per shot, trying to take mutual information into account.
- For each attribute, find 50 closest scans among last 1,000, using only that attribute in the distance formula; calculate the entropy of the GuessFactors returned (sliced into a VisitCountStats bin); and find the info gain (from maximum entropy); this info gain is the initial weight for that attribute in this calculation.
- Then, for each combination of two attributes, again find 50 closest scans among last 1,000, using only these two attributes in the distancing function; find the entropy and info gain (infoGainCombo) of the resulting GuessFactors; multiply both attributes' weights by ((infoGain1 + infoGain2) / infoGainCombo). E.g., if infoGain1 == infoGain2 == infoGainCombo, then nothing is gained from using both attributes vs using either one (i.e., they are measuring the same thing), so their weights are cut in half. I have no idea if this should scale linearly like this, but this value is almost always between .5 and 1, so I think this is at least somewhat sound.
- Loop through the (temporary) weights on each element, find the max, and then divide them all by the max, for normalization. This is so that one calculation matters as much as the next one.
- Add this temporary weight to the actual weight for each element.
1.19.32: Turned off all weighting, to serve as a baseline comparison for 1.19.33. Custom scaling for many attributes - e.g., third root of AdvancingVelocity, the portions from 0-1 and 7-8 count double in LateralVelocity, and decelerating values for accel are halved. Returned to Gaussian distribution in KernelDensity. Adjusted the GuessFactor bandwidth in the KernelDensity calculation to try and account for an issue David made me aware of. Returned to Gaussian distribution for KernelDensity.

Some info on the above:

Based on Dookious, so inherits the nicely polished wave handling and VirtualGuns systems and all other non-stats stuff.
I've been toying with a "Main Gun" and an "AntiSurfer Gun", but I am dropping the latter for now, for a couple of reasons: first, non-surfers are far more important to focus on at first with a gun; second, a DC gun tends to do OK against surfers anyway (note that Shadow only has one gun).
1.19.10 is just Main Gun, with 22,000 scans and a cluster size of 50.
KernelDensity (among closest scans) has two elements:
- First dimension is the difference in GuessFactors, Gaussian distribution with a bandwidth of half a bot width (much like Chalk and old Lukious).
- Second dimension is what I call "absolute ticks to fire time", which is min(time since bullet fired, time until gun cooled), with (1 - (absTicks / 9)) (I guess this is called a "Triangle" distribution).
Scan distance formula is Euclidean (square root(sum of squares)) distance with LateralVelocity, AdvancingVelocity, distance, WallDistance (forward and reverse), distance last eight ticks, time since velocity change (divided by bullet time), and abs ticks to fire time. Right now, all with intuitive (static) weights. (No, I don't actually do the square root on all of them, only the closest ones once I find them. :P)

A couple key speed improvements I've learned along the way:

Don't have any sin's or cos's in your distance formula. =) Kinda obvious, but it's easy to forget what's behind that "getLateralVelocity()" method call.
I was inspired by a WaveSurfing related tip from Krabb and found another excellent improvement: pass the current "max distance" to your scan distancing method as you're iterating through your scans, calculate your highest weighted factors first in the distancing method, check if you've already passed the max distance a few times throughout and just cut off if you have. I was able to increase from ~25,000 to ~40,000 scans with a first pass at this trick.
Keeping the list of closest scans sorted when generating them is really good (learned from Corbos' Chalk, he may have learned from ABC's DCBot). It's still O(n = cluster size) to update your maximum distance among your closest scans, but it averages to probably (n/2) or less if you keep it sorted, I think. Note: upon reviewing things, this doesn't matter as much as I thought!

-- Voidious

What exactly do you mean by "keeping the list of closest scans sorted"? Closest in terms of match closeness, time, or what? I'm not seeing how any of those gives you a speedup, you still have to check all stored scans, right? --David Alves

Ok, my bad for being so terse. Yes, match closeness. And yes, you still have to check all the scans. It's just one element - the updating of maximum distance among the closest scans - that is sped up here. As you're iterating through the 20,000 (or whatever) total scans, you need to keep track of what the maximum distance in the closestScans collection is, so that you can check if the distance for the current scan (in the loop through the 20,000) is less than that maximum; if it is, you sub the new scan into the closestScans array and then update the new maxDistance. I used to just iterate through all 50 elements in closestScans to find the maximum again after I changed one out. Now, I keep the closestScans list sorted by distance; if a new scan has a distance < maxDist, I start from the end of closestScans, shuffle the existing scans up, and insert the new scan into the correct spot. So maxDistance is just the distance on the last element in closestScans.

So you have far fewer total iteretions in all those recalculate-maxDistance loops and a lot fewer data reads, but a lot more data writes (updating references in closestScans). I actually wouldn't be 100% sure off the top of my head which "should be" worse, but I was able to go from ~10,000 to ~25,000 scans without skipping turns after I made that change.

-- Voidious

Thanks for the update! I know that the non-surfer scores are way more important, but improving the surfer scores is so fun! I don't understand the whole Gaussian distribution thing, I may have to look into that, just to learn what it means. But a note ... keeping your cluster sorted could result in O(log n), if you used a tree instead of an array to store them. -- Simonton

Or better yet, a heap! -- Simonton
Good idea! I'll look into it. -- Voidious
For what it's worth, I greatly overestimated the value of optimizing that portion of the getClosestScans? method, anyway. =) -- Voidious

Added another nice speed optimization to my list of tips above, took me from 25k to 40k scans. -- Voidious

Added info for 1.19.20 and noted for 1.19.15 that I'd enabled a bunch of attributes that weren't previously. I think all these extra attributes might be a problem - some of them have mutual information, yet they are being weighted individually, as if they are all independent variables. Thus in 1.19.21, 5 of them are disabled. Also, the entropy based weights in 1.19.21 are not quite linear (Math.pow(x, 0.75), not quite square root either). Fingers crossed =) I know entropy is a widely used way of judging what we want to judge - the information gain of a specific attribute in a data set - but I'm starting to doubt it is really what I want to use to determine my dynamic weights. I'll stick with it a little longer... -- Voidious

Another thing I want to try, very similar to something Simonton has mentioned: instead of measuring total entropy of a segmented VisitCountStats buffer, take the entropy of a closest scans cluster found with just that attribute; then the weight for that attribute could be a RollingAverage of the information gain (measured vs maximum entropy or an unsegmented buffer) each time I do that. (You'd still have to cut the GuessFactors into bins to get the entropy.) It should be roughly as expensive as aiming to do this for all attributes - both would compute distance for each attribute across all scans - so I could just do it on a different tick. Even limiting it to like past 1,000 scans could work just as well. It would also be interesting to see if this really gives any different results from my current setup. -- Voidious
This would still have the issue with mutual information skewing the overall weights - e.g., dist last 8 ticks and time since zero velocity probably measure many of the same situations the same way, so weighting them each by individual entropy will net you more weighting than you want. But if you did limit this part to 500 or 1,000 scans (instead of 40,000), that might be enough time to do some fancy [mutual information] calculations in there, as well. At first glance, that seems much easier in this setup than in the VisitCountStats buffers setup, where you'd need every combination of two-segment buffers in order to track / calculate mutual information. Hmm! -- Voidious
- Umm, or would you? I may need to get out the pen and paper tonight to strengthen my grasp of entropy and such. =) -- Voidious
What attributes do you plan to segment across within these clusters? Or by "an unsegmented buffer" do you mean a buffer of all scans...? -- Simonton

I dunno if i understand DC-guns correctly, but what about segmenting your scans (or should i say clustering :) )? Searching over ~20k scans gives me a headache :/ You wrote something about sorting your scans above, just ignore this post if you meant exactly this... I think it could be done in the same way like we segment our GF-stats: Put every scan in a multidimensional array (with as many dimensions as you have attributes, a multidimensional array of ArrayLists? would do the job). If you want to find the closest scan to your current scan, just search in segments around the segment in which the current scan is located. That should save you alot of needless searching :D You could first calculate the distance of your current scan to the "middle" of each nearby segment (I hope you get what i mean) and than search in all segments whith a short distance. May be it is faster to store your scans in a tree structure, but I don't know much about woods :P --Krabb

Hmm ... this idea could be just what we need to make room for more serious analyses of the data (every firing tick)! -- Simonton
On second thought, let's see ... 26,000 scans, we'll say 5 dimensions ... 11,881,376,000,000,000,000,000 array positions, 46 bytes per scan ... 497,078,232,001.5 terabytes of memory. Maybe that's why we don't do that ... But we could do it in, like, 2 dimensions to perhaps cut off totally wrong scans ... -- Simonton
I think you got me wrong, we dont need one array positions for each scan in each dimension. The idear was to put similar scans in one segment. If you have three attributes (=3 dimensions): distance, speed and acceleration, and you devide ech dimension in 5 parts you would have 5^3= 125 array positions. Ech position is one ArrayList of scans (instead of your bin-array in a gf gun). You would store your scan attributes and your gf(or whatever you use for aiming) in each "scan" class. You need to calculate the distance of your new scan (which should be used for aiming) to each segment in order to skip all scans in segments which are far away. --Krabb
(Edit Conflict) I don't think that's quite right. From what I understand, it would be 5 dimensions ^ 4 slices each (or whatever you decide), like current VisitCountStats guns. But instead of an array of doubles, you'd have an array of scans. Then, to find the closest scans, you wouldn't search your whole backlog - just findthe segment that matches the current scan, then search that one and all the neighboring groups for the closest scans. You're assuming the closest scans will always be close in the segmentation, which is probably true. It might work really well! -- Voidious

kd-trees (http://en.wikipedia.org/wiki/Kd-tree)? --Corbos

mom :) -Krabb
wow. -- Simonton
Yes, but a simple array would probably be faster. As we don't want to sort the scans entirely. --Krabb
Wowie, that is quite a tree. Its like it was made for Clustering. --Chase-san
Krabb - you think arrays could do it faster than O(log n)? -- Simonton
Hmm, i don't know much about such things (for now) :) But inserting a new scan would be independent of n. I can't say much about searching, because it depends on the number of scans in each segment wich is dependent on the number of slices in each dimension. But i think it is faster! But you could use a tree for each array point (instead of the proposed ArrayList), wich should speed things up a bit. --Krabb
If you assume, that the scans are distributed steadly and that the closest scan is in the neighboring segments, you should get O(log(n/slices^dimension)*3^dimension) for the seach (if you use trees for each array point), wich should be faster :). But the closest scan is not necessarily in the 9 neighboring segments and the scans are not distributed steadly... --Krabb

Surprisingly, the methodology I use in 1.19.24 is not that much code or that ugly. It is about as intensive as aiming. With 10 attributes, I'm doing 54 getClosestScans (10 with one attribute, 44 with two), cluster size 50 and log size 1,000, and an equal number of entropy calculations. The results are nothing too impressive, though. My next test is using this system for the info gain but not doing all the mutual information stuff - this also lets you use much bigger log size for the initial information gain calculation. -- Voidious

Name	Gun	CC	RMX	SHA	WS	WOE	Surf	DM	FT	GG	RMC	WLO	No Surf	Total	Comment
Lukious 1.19.10	DC/GF	68.03	74.49	64.95	69.13	69.21	69.16	87.11	77.76	82.63	77.79	81.97	81.45	75.31	3 seasons
Lukious 1.19.13	DC/GF	72.94	76.93	69.59	76.67	72.84	73.79	87.99	79.71	84.85	78.04	80.63	82.24	78.02	3 seasons
Lukious 1.19.15	DC/GF	69.82	76.40	71.37	75.43	69.87	72.58	87.66	78.94	83.03	76.44	79.56	81.13	76.85	3 seasons
Lukious 1.19.16	DC/GF	71.21	76.80	70.67	76.96	71.50	73.43	88.06	78.39	83.55	77.13	81.40	81.71	77.57	3 seasons
Lukious 1.19.20	DC/GF	70.95	78.45	72.45	77.76	71.60	74.23	86.83	78.06	85.96	77.66	79.59	81.62	77.93	3 seasons
Lukious 1.19.24	DC/GF							87.79	77.84	84.14	77.65	81.27	81.74		3 seasons
Lukious 1.19.32	DC/GF							88.25	78.40	84.78	77.47	80.30	81.84		8 seasons