ABC/TemporaryRRServer - Robo Wiki -= Collecting Robocode Knowledge =-

Ok, I can't take it anymore, life without a working RR server is too boring... :)

I've installed a rumble server on my home computer and have been running battles for the last couple of days. I also did a few experiments with the code, to better understand how it works. You can check the current rankings here:

http://abchome.aclsi.pt:8080/

Looks quite good even with only around 20 battles per bot. If you want to contribute, just change your client's configuration accordingly. Let's see how long it takes to get a truly stable ELO ranking.

I'm planning on adding an APS ranking next weekend, and then maybe put it on dedicated hardware.

-- ABC

Awesome, you're the man!! -- Voidious

Nice! Any plans to put melee or maybe team up there (though team is still working fine on Pulsar's server for now I think)? Hmm, I wonder how temporary this will be considering how things were last time someone talked of a "temporary" RR server... :) (Also, I just noticed RougeDC got an extremely unusual result against step.nanoPri_1.0, though I'm not sure if it's a bad client problem, something eating lots of the client's CPU, or if RougeDC somehow went and crashed really badly) -- Rednaxela

Good catch, very strange result. So far I'm the only one submitting results, and both my clients are 1.5.4, so no "bad" client. Let's wait and see if it is an isolated case; I can always delete that result and recompute the ranking. I made a small utility that re-submits all the battles in the log, but it's only feasible while the number of battles is relatively small. I haven't come up with an easy way of undoing a bad result; the iterative nature of the rumble makes it hard. The only way I see is having periodic backups to revert to and re-submitting the good results since then... -- ABC

Ah great work man! I think I'll update DrussGT, and leave a client running =) -- Skilgannon

Finally! You truly are the man! You know, an APS ranking wouldn't have the same trouble with deleting bad results ... ;) -- Simonton

Was just thinking that myself about APS ranking... ;) -- Rednaxela
I don't see how APS would be different, the problem is the rolling average used for the scoring. How do you "unroll" a bad result without checking every single previous battle or knowing the previous score before the bad result? Looking for all the battles between 2 specific bots can become hard when the battles log file becomes too large. -- ABC
The APS score I'm thinking of would not be a rolling average. Keep "total points" and "num battles" for each pairing, then to remove a bad one subtract its score and 1, respectively. Calculate the change in the average for that pairing and apply 1/num-pairings-th of it to the overall average. -- Simonton
Ok, that's a solution for the ELO too, the next few battles would put the affected bot in its proper place again, or I could force a rating recalc with an increased constant. The only difference to the current setup is that data saving bots would lose the small advantage of the rumble "forgetting" old results, I'd be ok with that. -- ABC
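The per-pairing bookkeeping Simonton describes could look roughly like the sketch below. It is only an illustration, not the rumble server's code: the class name `ApsTable`, its methods, and the bot names in the usage note are all made up here. Instead of nudging the overall average by 1/num-pairings of the change, it simply recomputes the average of the pairing averages, which gives the same number.

```python
from collections import defaultdict


class ApsTable:
    """Hypothetical per-pairing totals so a single result can be undone exactly."""

    def __init__(self):
        # totals[bot][enemy] = [sum of percent scores, number of battles]
        self.totals = defaultdict(dict)

    def add(self, bot, enemy, percent_score):
        entry = self.totals[bot].setdefault(enemy, [0.0, 0])
        entry[0] += percent_score
        entry[1] += 1

    def remove(self, bot, enemy, percent_score):
        # Undo one earlier add(): subtract its score and 1 battle.
        entry = self.totals[bot].get(enemy)
        if entry is None or entry[1] == 0:
            raise KeyError("no recorded battle for this pairing")
        entry[0] -= percent_score
        entry[1] -= 1
        if entry[1] == 0:
            # The pairing no longer counts toward the average at all.
            del self.totals[bot][enemy]

    def aps(self, bot):
        # Average of the per-pairing averages; None if no pairings yet.
        pairings = self.totals[bot]
        if not pairings:
            return None
        averages = [total / count for total, count in pairings.values()]
        return sum(averages) / len(averages)
```

For example, after `add("BotA", "BotB", 60)`, `add("BotA", "BotB", 0)` and `add("BotA", "BotC", 80)`, calling `remove("BotA", "BotB", 0)` leaves `aps("BotA")` at the average of 60 and 80. Unlike a rolling average, nothing here "forgets" old results, which is the trade-off mentioned above for data-saving bots.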

This may take a while to get a workable field, with this many bots. For each bot to have a battle against every other, it is n + (n-1) + (n-2) + ... + 1 = n/2*(n+1) battles, which makes 205761 battles. For each bot to have 2000 battles it is 2000/641 times as many, 642000 battles. Each battle produces the results for 2 bots, therefore 321000 run battles. Assuming the average battle takes 1 minute, we have 5350 client-hours of work, or 223 client-days. If we're running (best case) 4 clients 24/7 it will take 55 days. If we just want 1 battle per enemy it will still take ~18 days, but realistically more than that because some battles will happen twice. I suggest we lower the battles-per-bot constant to ~650 in the roborumble client so that it will fill in the PL gaps earlier, rather than running duplicates while PL slots are still missing. -- Skilgannon
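A small sketch for redoing this kind of estimate with other numbers. The function names are made up; the 641 bots, 1 minute per battle and 4 clients come from the post above. Two things are counted differently here, so treat the output as a cross-check rather than a correction. Distinct pairings are taken as n(n-1)/2, since a bot does not fight itself. A pairing is already one battle covering both bots, so it is not halved again. With those choices the totals come out at roughly twice the figures quoted above.

```python
def full_pairing_battles(num_bots):
    """Battles needed for every bot to meet every other bot once."""
    return num_bots * (num_bots - 1) // 2


def battles_for_quota(num_bots, battles_per_bot):
    """Battles needed for every bot to reach a quota.

    Each battle counts toward the quota of both participants,
    so the total is half of num_bots * battles_per_bot (rounded up).
    """
    return (num_bots * battles_per_bot + 1) // 2


def days_needed(battles, minutes_per_battle=1.0, clients=4):
    """Wall-clock days if `clients` machines run battles around the clock."""
    return battles * minutes_per_battle / 60.0 / 24.0 / clients


if __name__ == "__main__":
    bots = 641
    once = full_pairing_battles(bots)
    quota = battles_for_quota(bots, 2000)
    print(once, round(days_needed(once), 1))
    print(quota, round(days_needed(quota), 1))
```

Under these assumptions a single full pairing is 205120 battles, or about 36 days on 4 clients, and 2000 battles per bot is 641000 battles, or about 111 days. Duplicates before the pairings fill in would push the first figure higher, as noted above.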

I've used a battles-per-bot setting of 10, then increased it to 20, and will now set it to 50. I'm hoping it will prevent a lot of duplicates that way. 18 days for full pairing would be great! But I believe the (ELO) ranking will be "stable" before that, like with 100-200 battles per bot. -- ABC

Sorry - I missed changing one of the URLs in my properties files. My clients are the culprits of giving over 200 battles to those handful of bots. I wish it was a harmless mistake, but now those bots' rankings may not stabilize until everyone else catches up and they start running again unless someone (ABC) intervenes by re-calculating their scores after the other bots have settled into a more stable state. -- Simonton

No problem, until everyone has full pairings the probability of anyone fighting is the same, I think. I'm more concerned about the bad results still happening. I just found two more: stelo.UnfoolableNano against Dookious and DrussGT. Both battles submitted by Nfwu. Could there be other problematic Robocode versions? -- ABC

Version 1.6.1.1. Stopped client for now. Should I switch to 1.5.4? --Nfwu

Yep, I believe 1.5.4 and 1.6.0 are the only safe versions. But first I need to find an easy way to revert the bad results, please use a different name for submitting battles for now, I will probably have to delete all battles with your user name (nfwu). -- ABC

UnfoolableNano vs. Ascendant, not Dookious, I think? How about just removing these pairings from the results files of these bots? -- Skilgannon

Yes, and yes. But I want to make it at least semi-automatic. I'm also considering refusing results with >50 PBIs. -- ABC
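A rough sketch of what such a server-side check might look like. It reads ">50 PBIs" as a result that deviates by more than 50 percentage points from the score the current ratings predict, which is an interpretation. The expected score here uses the textbook logistic Elo formula as a stand-in; the actual rumble rating code may well differ. Function names are hypothetical.

```python
def expected_percent(rating, enemy_rating):
    """Predicted score share (0..100) from two ratings.

    Assumption: standard logistic Elo expectation with a 400-point scale,
    used only as a placeholder for whatever the server really computes.
    """
    return 100.0 / (1.0 + 10 ** ((enemy_rating - rating) / 400.0))


def is_suspicious(percent_score, rating, enemy_rating, max_pbi=50.0):
    """True if the result is further than max_pbi points from the prediction."""
    pbi = percent_score - expected_percent(rating, enemy_rating)
    return abs(pbi) > max_pbi
```

With these assumptions, a top bot scoring 5% against a much lower-rated nano would be flagged, while a 99% result would not. A check like this is probably only meaningful once ratings have settled, since early ratings could make valid results look wrong.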

Looking for all the bad results to remove the pairings was too much work. I had to reset the rankings, remove all of Nfwu's battles from the log and resubmit all the battles (multiple times, it's fun seeing the ranking stabilise even with a small number of battles :)). I also did a backup of the ranking and details files and cleaned the battle logs. Maybe I'll set up an automatic periodic backup so that I can revert it back a few days in case of corruption. -- ABC

How "temporary" do you intend this server to be? For a long-term solution, I would recommend storing all results in a database rather than a log, so that queries can find all kinds of fun things much easier than looking through the log manually, or writing a new script to extract data you want (from bad results to new scoring schemes ... maybe a simulated bracket tourney if you like). I plan to have the RoboResearch server keep all the results that pass through it, including those bound for the rumble, so maybe that database could be a good place to turn in the future. I figured I'd let you know, so you can plan the best use of your time. -- Simonton

Excellent question! With my second child coming at the end of this month I don't think I'll have a lot of time in the next few months (years?) :). I just thought I would have some fun with Java application servers instead of tweaking Shadow without a rumble to test it on. Anyway, if you are developing a battle results database, that's where the future rumble must come from, not from my (very) modest Java servlet skills. I'll just keep this one running until someone comes up with a better solution. And in the meantime I'll try adding some simple tweaks of my own, the priority being always that it stays functional and online for people to use. -- ABC

I've been toying with the idea of creating a database-backed rumble server to address some of the recent problems. It wouldn't be difficult to extend the client to send additional info (e.g. client version, match type) to add more error checking too. But this all relies on free time...

Thanks for getting this temporary server up -- now that I've fixed my rumble machine, I'll start helping out with the crunching and remove Gaff 1.28a from the list. -- Darkcanuck

I just found one of my clients stuck: it would run 30 battles of Okami vs other bots, then try uploading the same 30ish results from some previous run, hit an exception, and repeat. So some battles got submitted many many times. I deleted all the files in "files/" and "temp/", and now it seems to be working again. Not sure if you'll want to roll back results? I'm using version 1.6.0. -- Simonton

As long as the results are valid I see no need to revert them. I'm really liking the look of the ranking, btw, even with under 100 battles per bot the numbers and positions are very close to what I remember from when the old server was 100%. I'll add an APS sorted ranking table later today. -- ABC

This happens when you have a low number for battles_per_bot (e.g. 100). When all bots have that number, the priority battles shift to filling up the missing pairings, and then it frequently happens that the server decides to fill it up for one bot first. Normally you won't notice it, as with 2000 battles only 15 or so pairings are missing. Now around 500 pairings are missing :D. Just set your battles_per_bot to 750, and the battles will be distributed evenly again. -- GrubbmGait

As everybody seems to run battles here at the moment, those 'corrupted file' bots could be added again to the participants I think. -- GrubbmGait

Or maybe we could make an alternative participants page to include them. APS ranking: add &type=APS to the rankings URL.

Looking at the latest Robocode version changelog I think it would be safe to use for rumble. All those 'funny' results could definitely have been from a 'swapping' of scores of the 2 bots. Opinions? -- Skilgannon

Yes, the sooner we test it the better. Please use a different user name for submitting results. GrubbmGait is currently submitting results as "GrubbmGait 160", a very nice way to differentiate client versions. -- ABC

Great idea -- done! I've been working on my database-backed server idea the past few days; uploading is working and I just need to put together some polished ranking pages. No ELO yet though. The concept is to keep the originally submitted data so that it's fairly easy to remove bad results and rebuild the rankings. The current rumble client doesn't send its version number (just "1") but future clients could send the actual version and we could put new releases on probation. I'll put up a page about it when the server goes live. Questions/suggestions welcome! -- Darkcanuck
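To illustrate the "keep the submitted data, rebuild on demand" idea, here is a minimal sketch using SQLite. It is not Darkcanuck's schema or code: the table layout, column names and function names are all invented for the example, and it only rebuilds a simple APS-style average rather than ELO.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a results store; every submitted row is kept as-is."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS results (
               id INTEGER PRIMARY KEY,
               submitter TEXT NOT NULL,
               client_version TEXT NOT NULL,
               bot TEXT NOT NULL,
               enemy TEXT NOT NULL,
               percent_score REAL NOT NULL
           )"""
    )
    return db


def submit(db, submitter, client_version, bot, enemy, percent_score):
    db.execute(
        "INSERT INTO results (submitter, client_version, bot, enemy, percent_score)"
        " VALUES (?, ?, ?, ?, ?)",
        (submitter, client_version, bot, enemy, percent_score),
    )


def purge_submitter(db, submitter):
    """Delete every result uploaded under one user name; returns rows removed."""
    cursor = db.execute("DELETE FROM results WHERE submitter = ?", (submitter,))
    return cursor.rowcount


def rebuild_aps(db):
    """Recompute each bot's average of per-pairing averages from the raw rows."""
    rows = db.execute(
        """SELECT bot, AVG(pair_avg)
           FROM (SELECT bot, enemy, AVG(percent_score) AS pair_avg
                 FROM results
                 GROUP BY bot, enemy) AS per_pairing
           GROUP BY bot"""
    ).fetchall()
    return dict(rows)
```

Because the rankings are derived from the stored rows, dropping one submitter's (or one client version's) battles becomes a single delete followed by a rebuild, rather than a reset and re-submission of the whole log. Filtering on `client_version` in the same way would be one route to putting new releases on probation.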

My experimental server is now live. But keep your clients pointed to ABC's server unless you really, really like hunting bugs. --Darkcanuck

Nice work, good luck with the bug hunting. Meanwhile I added a cool JavaScript LRP generator to my server, check it out if you have a decent (non-IE) browser. :) -- ABC