Roborumble.org


http://roborumble.org/home

What is it?

Two things:

What's Done?

What's Not Done?

News

I put Phoenix 0.1 and 1.02 onto it as a test. Feel free to upload your own bots. I'll make sure to keep them when I update the application. Bots can be downloaded with URLs like this:

http://roborumble.org/download/Phoenix

More generally, roborumble.org/download/<somerobot> will always download the latest version of <somerobot>. Specific versions can also be downloaded, if you know you want a particular version. --David Alves

About the whole codesize/properties thing, I think I know what it's for: the storage of robots and such, right? You could always just parse the file name, as it should do now. And on its first battle, have fnl add code so that it returns the codesize, and then determine its league. I don't think you want to invoke the Java runtime (which here on Windows takes up a good 10 MB to 20 MB of RAM).

Also, note, I would be happy to help with the CSS/XHTML/site design/PHP. I have been a webmaster for the last, what, it's almost 2008... 4 years. I do valid XHTML Strict, and I'm a CSS wizard (I'm pretty good at complex cross-browser CSS), and I wrote the backend on the site I administrate.

--Chase-san

Oh, also, could you make it so people can use a flag other than the one for their country? That has come up before (they may tell everyone where they live, but wish to represent another country). --Chase-san


It is planned to have a completely new Robocode server, right? I think we should think about other rating systems. The current one is good at estimating the rating without many battles, but has the downside of rating drift. I think we should base our new score just on the average score percent (e.g. score = accumulated score / number of bots). A nonlinear translation into values around the current rating points would be nice, but I doubt we could get the exact same values. --Krabb

I did send him my prototype of the ELO-based system; he just hasn't done anything with it yet. (It worked, it was just lacking large data tests and such.) --Chase-san

Hmm, isn't the current system also ELO based? I think an ELO system is useful in cases where you have a small number of games and many competitors (like chess rankings or online games like Warcraft III). A human player can only play a limited number of games, but we can calculate hundreds of Robocode games per day. The problem with ELO-based systems is that your rating depends on your opponents' ratings (which depend on their opponents' ratings as well...), so your rating can't be stable. --Krabb

We should just switch to PremierLeague across the board. =) -- Voidious

That might work. I like the ELO system a lot, but true, a PremierLeague setup would be more stable. --Chase-san

The 'total % wins' column in the PL ranking already contains the accumulated score percentage/100, almost the same scoring Krabb suggested when starting this discussion. It has the advantage of being stable when all pairings are done, but not the rigidity of the PremierLeague, as it still matters how well you beat others. The ranking would be very close to the current ranking. Currently DrussGT would score 520/606*100 = 85.81, while both Ascendant and Shadow would score 510/606*100 = 84.16. I don't like the rigidity of the PL and I also don't like the rating drift, and this system would combine most advantages and some disadvantages of both systems. I don't think that the outcome could be mapped to the current rating though (85 => 2100). Another small advantage: scoring like 5763 to 0 now would count instead of being ignored. -- GrubbmGait

It should not be that hard to map the new score to our old rating system. Looks like a simple polynomial would do the job:
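A minimal sketch of the average percent score idea discussed above, assuming each pairing is stored as the bot's percentage share of the total score against that opponent. The function name, the dictionary layout and the three-bot example are invented for illustration; only the 520/606 and 510/606 figures come from the post above.

```python
def average_percent_score(pairings):
    """Mean of a bot's score percentages over all of its pairings.

    `pairings` maps an opponent name to this bot's share of the total
    score in that pairing, as a percentage in [0, 100].
    """
    if not pairings:
        raise ValueError("no pairings recorded")
    return sum(pairings.values()) / len(pairings)


# Hypothetical three-opponent example (numbers invented for illustration).
example = {"BotA": 90.0, "BotB": 75.0, "BotC": 60.0}
aps = average_percent_score(example)

# The figures quoted in the post: accumulated 'total % wins' over 606 pairings.
druss = round(520 / 606 * 100, 2)
ascendant = round(510 / 606 * 100, 2)
```

Because every pairing carries the same weight, the value stops moving once all pairings have enough battles, which is the stability property mentioned in the post.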

http://designnj.de/roboking/rating_table.JPG

What do you think? --Krabb

Hmm, I just made an account on Roborumble.org to test, but I accidentally put the country down as Afghanistan instead of Canada, and I can't see any way to change it. Hopefully things like changing that will get implemented, or it could get fixed some time ;) Nice stuff for how this is looking so far. I'd be willing to help if only I knew Ruby. -- Rednaxela

If I try to upload a bot, somewhere it is turning into a nil object and breaking. I can post the full error message if you are unaware of this problem. --Baal

There is an error if, on the "Robots & Teams" page, you click an uploaded robot. If the robot is not uploaded, everything seems fine. -- asdasd

Well, as it stands right now, Roborumble.org isn't really working but I hear David is working on it. Hm.. maybe I should learn some Ruby to help out on this... -- Rednaxela

To change the subject back, I too would like to see a scoring more like a simple average percent of total score. No fancy math after that, every bot just gets a rating between 0-100. Or multiply by 10 to get 0-1000, that might be cooler. I see no need to attempt a mapping that makes the scores look the same as they do now. --Simonton

Hmm, I think such an average rating would be good too, just so long as we keep the PL rankings too, after all it's the easiest way to see someone Undefeated. Personally I'd rather 0-100 than 0-1000 but that's a matter of personal taste and I'd still want 4 digits of precision. Only problem (not a serious one) with changing the rating system like this, is The2000Club might be in need of a replacement =P. One other thought, is it might be nice to use some statistical methods to calculate "estimated error" values, to give an estimate about how much a ranking might be affected by lack of battles. Not a critical feature nor would it affect ranking/ratings, but a would-be-nice-to-have thing. -- Rednaxela

Good idea w/ the error values, but I guess I'm not sure how they would be calculated. What would you suggest? The only thing I can think of is to keep a history of how much influence x battles have on total score after y total battles, then use the average. But I definitely want to keep the premier league, too. Also, maybe the scores should be multiplied by 100: that way we can see 4 digits (I'm just not a fan of decimals in the scoring) AND we would get things like The8000Club, which sounds WAY better than The2000Club (like 4 times better!). --Simonton
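One possible answer to the "estimated error" question, offered only as a sketch: treat the battles against one opponent as independent samples and report the standard error of their mean. This is a swapped-in textbook technique, not something proposed in the thread, and the function name and sample numbers are hypothetical.

```python
import math
import statistics


def score_with_error(battle_scores):
    """Return (mean, standard error of the mean) for one pairing.

    `battle_scores` holds the bot's percentage score in each battle
    against a single opponent. Assumes battles are independent samples.
    """
    mean = statistics.mean(battle_scores)
    if len(battle_scores) < 2:
        # One battle gives no information about the spread.
        return mean, float("inf")
    sem = statistics.stdev(battle_scores) / math.sqrt(len(battle_scores))
    return mean, sem


# Hypothetical results of four battles against the same opponent.
mean, sem = score_with_error([60.0, 70.0, 80.0, 90.0])
```

The error shrinks with the square root of the number of battles, so a pairing with few battles would show up with a visibly larger margin.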

I'm all for it =) I need something new to work on with DrussGT. Moving the goalposts will do just that. -- Skilgannon

No matter how you twist the rankings DrussGT is the goalposts... :) -- ABC

I'd certainly call the PL a "twist" in a way, and in that way, DrussGT isn't quite the goalposts... besides ABC, you're the one that sets the goalposts in melee, holding both the top place and 2 others in the top 5 in the megabot melee ;) -- Rednaxela

Well, put it this way: if improving 5% against top bots will boost my score as much as improving 5% against low ranked bots, I'll get around to adding other things, like an anti-surfer gun. I've finally figured out why improving scores against low ranked bots helps more than against high ranked bots in the old system: as your rating increases, the expected score doesn't change as much due to the ELO system, so it will contribute to canceling out ProblemBots. Against mid/high ranked bots the expected score changes more as your score increases, so they no longer contribute to canceling out ProblemBots, and may even become ProblemBots. Make sense? -- Skilgannon

The fact that your score wouldn't depend on beating low-ranked bots more than any others is a strong reason to move to this kind of system, in my opinion. --Simonton

To compare, ELO is mostly a measure of how much you thrash the weak, average score is how well you fight overall, and PL score is mostly a measure of how you overcome the strong. I think each of those is interesting and has its place, but at least personally, ELO carries less interest than PL and average score. -- Rednaxela

I believe that the % score ranking will end up exactly the same as the ELO one, but will have some (important, imo) disadvantages. You guys are wrong in thinking that in that setup beating the low-ranked becomes less important; if anything it will be more important. With a 1% score increase meaning exactly the same against anyone, getting those easy points will be essential. With ELO, 5% against top bots will boost your score as much as against low ones, Skilgannon (it only depends on how far you are from the expected score curve), but it's very hard to do it without losing more than that against the middle ones! -- ABC

        EstimatedScoreFraction = 1 / (1 + 20^(-RatingDiff/800) )
        NewRating = OldRating + CorrectionConstant * (ActualScoreFraction - EstimatedScoreFraction)
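The two formulas above can be sketched in code as follows. This is only an illustration under stated assumptions: RatingDiff is taken to be the bot's rating minus the opponent's rating, and the value of CorrectionConstant is not given in this discussion, so the 10.0 used in the example is an arbitrary placeholder.

```python
def expected_score_fraction(rating_diff):
    """EstimatedScoreFraction = 1 / (1 + 20^(-RatingDiff/800))."""
    return 1.0 / (1.0 + 20.0 ** (-rating_diff / 800.0))


def updated_rating(old_rating, opponent_rating, actual_fraction,
                   correction_constant):
    """NewRating = OldRating + CorrectionConstant * (Actual - Estimated).

    Assumes RatingDiff = old_rating - opponent_rating.
    """
    estimated = expected_score_fraction(old_rating - opponent_rating)
    return old_rating + correction_constant * (actual_fraction - estimated)


# Equal ratings give an expected score fraction of exactly one half.
even = expected_score_fraction(0)

# An 800-point lead gives 1 / (1 + 1/20) = 20/21 of the score.
big_lead = expected_score_fraction(800)

# Placeholder constant of 10.0; scoring 60% against an equal opponent.
new_rating = updated_rating(1600.0, 1600.0, 0.6, 10.0)
```

Since the curve flattens at large rating differences, the expected score against much weaker bots barely changes as a rating rises, which matches the behaviour Skilgannon describes.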

For an improved variant of the old ELO system, you can also check out the Glicko systems. For details have a look at the website of Prof. Mark E. Glickman, their inventor: [1]. (I haven't followed all the discussion, but I still think something like this gives very good ratings.) -- Qohnil

@Simonton: I think the first test you should make is to generate an APS ranking and compare it to the current (ELO) one. I believe they will be exactly the same, except maybe in some very close cases where ELO will serve better as a differentiator, IMO. Also, I always wanted to know what would happen if we increased the round number of the rumble battles. If we ever get that kind of ranking (100 round battles, for example), full pairings become harder to get, but with ELO we can get a meaningful ranking much more easily. -- ABC

Wrote something quick: Nfwu/EloSim. It basically simulates what the server does. The RR@H source is from Albert's website: http://www.geocities.com/albert_pv/RoboRumbeAtHome.html This decay (wins1 = 0.7 * wins1 + 0.3 * real1; wins2 = 0.7 * wins2 + 0.3 * real2;) looks interesting. For a test on: (A/B=25/75, A/C=75/25, B/C=25/75)

...
A > C > B in terms of rating. Or I may have screwed something up. I did screw something up. A, B, and C have very close rankings, but are different on every run. -- Nfwu

Last edited September 2, 2008 8:19 EST by Nfwu