WikiOutage


/OldStuff

May 25 2004

The wiki has been dead for hours and hours. I think it is the RoboRumble server that runs amok somehow. As soon as I start it, my server runs out of memory. I'll look into it later, probably not before 8 pm CET on May 25th though. If the wiki dies on you, you know it won't be up very soon either.

Please stop your RoboRumble clients from uploading results until we have a plan for starting it all up again.

Sorry about this folks...

-- PEZ

10:01 CET - OK. So I might have fixed the problem. At least temporarily. The battles log files have grown really huge and it might be that the rumble server tries to feed a lot of it into memory upon startup or something... I moved the log files away and the poor server seems to still be standing now even after I've started the rumble servlets again. If the wiki goes away now I'll be unable to fix it since I'm not home and when it goes away it goes away and I can't reach it from here... -- PEZ

Yeah, and if anyone was alert enough to shut down their clients like I asked above, you can start them up again now. If it works it works. If not, well then I'll try to fix it tonight. -- PEZ

19:40 - Outage again. A "Denial of Service" attack this time. That or a most impolite spider indexing the wiki. Anyway, blocking that IP number allows the wiki to serve again. Why can't things stay easy? ... -- PEZ


Ok, I don't know if anyone else has had problems but I can't actually access the wiki from home at the moment. Browser just says it's unable to resolve the host name. Is this happening to anyone else, or is my ISP playing up again?? Oh, and please excuse whatever the formatting of this post comes out like, I hate using lynx... --Brainfade

I had the same problem recently. The problem was (well, actually is) that my ISP is no longer able to resolve robowiki.net. Luckily http://robowiki.dyndns.org still works. So I can still browse the wiki, except for the current rankings of the rumble. --Mue


Same thing as Mue describes above right now (12:40 CET, Aug 15), or is it just me? -- Pulsar

You mean you cannot resolve robowiki.net? I don't know why this is so, maybe my internet connection has too much latency for a name server... Anyway, you can always replace robowiki.net with robowiki.dyndns.org. This should be true also for the Rankings servlets. If not let me know and I can probably fix it. If anyone can act as name server for robowiki.net, please let me know. -- PEZ
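
For anyone scripting against the wiki, the host swap described above could be sketched roughly like this. It is only an illustration: the helper name `with_fallback_host` is made up for this example, and it assumes the dyndns host serves exactly the same paths as robowiki.net.

```python
from urllib.parse import urlsplit, urlunsplit

# Both host names come from the discussion above.
PRIMARY_HOST = "robowiki.net"
FALLBACK_HOST = "robowiki.dyndns.org"


def with_fallback_host(url: str) -> str:
    """Return url with robowiki.net swapped for the dyndns name (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.hostname != PRIMARY_HOST:
        return url  # some other host: leave the URL untouched
    # Keep an explicit port if the original URL had one.
    netloc = FALLBACK_HOST if parts.port is None else f"{FALLBACK_HOST}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(with_fallback_host("http://robowiki.net/perl/robowiki?WikiOutage"))
```

The path, query string and any port are kept as they are, so the same call should work for wiki pages and for servlet URLs alike, if the servlets really are reachable under the fallback name.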

Yes, exactly, I can't. Hehe, I offered my name server services already (read ICQ! :-) ) -- Pulsar

Big thanks for the offer! But now trying to configure for using that I noticed I can get the name server services for free from my domain name provider. I've updated the records now and hopefully within a day or two everybody should be able to use the robowiki.net domain name. -- PEZ


Aug 17 2004

I'm having problems with the wiki server. I think it's hardware related. The whole thing just freezes on me. It has happened twice in two days now. This might mean the wiki will be offline quite often for a while, until I have gotten this sorted out... Sorry about this. -- PEZ

September 7 2004

PEZ, the servlets have stopped responding again. I tried logging in and just stopping and starting the tomcat service but the init.d script is reporting an error. I did not want to go any deeper into it than that without knowing what you may or may not have done to it. I hope you get to see this before too long. My new DH is ready to go and waiting to be entered into the rumble :) -- jim

Nevermind. They seem to be working now. Strange. -- jim

September 26 2004

I couldn't reach the wiki for some time. The server responded to ping, but the webserver didn't return anything. There have been no Changes in that period (Teleport 19:09 to GuessFactorTargeting 22:46). -- Jonathan

It responded to ping? Hmm. I think that must have been the firewall responding. Because the server was frozen. It happens often. Something wrong with it that I haven't figured out. Other than it happens easier with higher network load. Not a good property for a web server, is it? ... Well, it's an old and tired machine and I'll probably have to replace it some day soon. -- PEZ
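
Since a ping reply may come from the firewall rather than from the server itself, a check at the HTTP level says more about whether the wiki is actually serving. A minimal sketch, assuming plain HTTP on port 80; the function name `http_alive` and the five second timeout are arbitrary choices for this example, not anything the wiki setup prescribes.

```python
import http.client


def http_alive(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> bool:
    """Return True if the web server answers a HEAD request in time (hypothetical helper)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        # Any non-5xx status means a web server process really answered.
        return conn.getresponse().status < 500
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout or a garbled reply: treat as down.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print(http_alive("robowiki.net"))
```

A frozen machine behind a live firewall would show up here as a timeout and therefore as `False`, even while ping still gets replies.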

Shall we start a collection? I'm willing to chip in $100 USD to help buy a new server. --David Alves

Wow. I truly appreciate that offer! But it shouldn't be necessary I think. Tell you what... When I get the time to figure out what kind of machine should replace the current one I'll get back to this collection idea if I think it's too heavy for my household budget. Thanks again for the offer! I'm amazed! -- PEZ

September 27 2004

Down for a loooong time just now... --David Alves

Tell me about it... -- PEZ

At least $50 USD you can count on me (Brazilians are poor u know). -- Axe

Thanks. But no donations are accepted yet. =) The problem is more time than anything else here actually. I don't know when I'll be able to book a time slot large enough to deal with this. But, the problem accelerates it seems. So deal with it I must. The machine didn't want to start up automatically this time. Some files that were open during the crash are damaged I guess. The RR@H general rankings table is a bit odd. Let's hope it fixes itself... -- PEZ

WoW?! The rankings table gone iNSanE?... MyFirstBot is the King :P Beatyfull! -- Axe

Do you save all submitted battles? Or are the individual battles just incorporated into the ratings and then discarded? --David Alves

All battles are saved in a log... Ask Albert how it works. -- PEZ

OK. The only times I know the server hangs with pretty high probability are when files are uploaded to or downloaded from the server. This time it was the POD version of SilverSurfer I think. I have closed the uploads script temporarily. And the bots that are currently supposed to be downloaded from this server must be hosted elsewhere for now. David and Pulsar, could you be of service here? Please update the records for SilverSurfer and SilverFist once the new locations are set. -- PEZ

I can host files on davidalves.net or mandatoryexplosions.com. --David Alves

Thanks for the offer, David. How can we do this? Can i send u an e-mail with both versions? -- Axe

I uploaded them from my RoboRumble@Home /robots directory and updated the RoboRumble/Participants page. When you release your next version you can email it to me. --David Alves

Love ya David! -- PEZ

I love u too, man. Ok, maybe u dont want all that love, but thanks a lot! -- Axe

You're welcome. :-)

RATING DETAILS FOR sgp.JollyNinja 3.53 IN GAME roborumble
CURRENT RATING = 1637.37
Specialization Index = 20.7
Momentum = -4227.899999999999
Do you think the table will fix itself or do we need to (gasp!) reset the ratings? Numbers like this are a little worrisome... --David Alves

I really don't know... Resetting the rankings shouldn't be necessary though. If my backups work I can always bring back the state of last Sunday (two days ago).

The server hangs very often now. If someone out there has some knowledge about PC hardware and networking stuff, please contact me, pez@pezius.com. I don't know how to trace the problem. But it seems to be something with the local network. Could a faulty Ethernet card on a machine in the LAN cause hangs on a server when accessing it? -- PEZ

That doesn't sound likely, but I'm not a hardware expert. What is the last entry in the apache log before the server hangs? That might help us tell if it's the wiki scripts or RR@H that's hanging the machine. I'm betting it's a concurrent modification problem with RR@H. --David Alves

Please start a mailing list or something for this. I don't like to display too many implementation details about the server in a forum as public as this one. If you start such a mailing list and invite me to the list, use my gmail.com address (pezius@). -- PEZ

I somehow doubt the problems have anything to do with any particular apache/tomcat setup. The hangs happen at any semi-heavy network activity. Be it smb, nfs, ssh, http, whatever. -- PEZ

Anyway. The RR@H will be down for a while. I think it is best if you all switch off your clients for now. Then, when we get the RR@H running again we might need many clients depending on how much data we have lost. Sorry for this guys. I really trusted my backups.... Never do that. -- PEZ

I managed to mix and match stuff a little so I've got the rankings from a month and a half back up now. But as results are uploaded the ranking table should restore itself I think. Without all missing battles having to be rerun that is. Let's see what happens. -- PEZ

I know this is a bit early. But I now have a theory on what's going on. We had accelerating outage problems with the wiki server a few months ago. Back then the server's load rose through the roof and I was often forced to just switch the power off and on again. That time I tracked the problem down to the RR@H server's battle log files, which had grown huge. The reason I haven't suspected the same thing this time is due to two main things: refusal to learn (the real reason) and the difference in symptoms (the excuse). Anyway I think it's the same problem now. So if I just remember to empty the battle log files now and then, we should have the wiki and RR@H up much more than in the last few days at least. Cross your fingers I am guessing right here. -- PEZ
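
The "empty the battle log files now and then" chore could be automated along these lines. This is a sketch under assumptions, not the RR@H server's own mechanism: the function name `rotate_if_large`, the size threshold and the timestamp suffix are all invented for the example, and it assumes the rumble servlets are stopped while the file is moved, so that nothing keeps writing to the old file.

```python
import os
import time
from typing import Optional


def rotate_if_large(log_path: str, max_bytes: int) -> Optional[str]:
    """Move log_path aside if it is larger than max_bytes (hypothetical helper).

    Returns the archived file's name, or None if nothing was done.
    """
    if not os.path.exists(log_path) or os.path.getsize(log_path) <= max_bytes:
        return None
    # A timestamp suffix keeps older archives from being overwritten.
    archived = f"{log_path}.{time.strftime('%Y%m%d-%H%M%S')}"
    os.replace(log_path, archived)
    # Leave an empty file behind so the server finds its log on startup.
    open(log_path, "a").close()
    return archived
```

Run from a nightly job, something like `rotate_if_large("battles.log", 50 * 1024 * 1024)` would keep the live file small while the old battles stay available in the archived copies. Both the file name and the 50 MB limit here are placeholders.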

These things happen, PEZ, no worries. So can we all start our RR clients now? Or do you still need to do stuff? --Vic

No worries? I'm worried like hell when my backups don't work! But yeah, maybe that wasn't clear: Please start your RR clients folks! I think now the rankings have repaired themselves. But we have a few bots that want to know their place on the ladder. -- PEZ

I understand you are worried. Just wanted you to know that I appreciate a lot what you do for us and you shouldn't feel like you've let us down or something. I'll start two clients right away.... --Vic

Thanks! I know I have made no promises, but it really feels awful when the wiki and RR@H are unavailable. And even worse when I feared we would have to do a full reset of the rankings. But now the immediate crisis is managed I hope. We will need to guard the system a bit better from now on though. If we can mirror the data to some places outside my home we will at least know we can always recover. That kind of confidence is precious. -- PEZ
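
Given the lesson about trusting backups, a mirror is worth more if the archive is read back and checked right after it is written. A minimal sketch of that idea, assuming the wiki data is a plain directory tree; the function names and the choice of gzip tarball plus SHA-256 are this example's assumptions, not how the server is actually backed up.

```python
import hashlib
import os
import tarfile


def sha256_of(fileobj) -> str:
    """Hash a binary file object in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(65536), b""):
        digest.update(chunk)
    return digest.hexdigest()


def archive_and_verify(src_dir: str, archive_path: str) -> bool:
    """Pack src_dir into a .tar.gz, then re-read it and compare checksums.

    archive_path must lie outside src_dir. Hypothetical helper.
    """
    expected = {}
    with tarfile.open(archive_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
                with open(full, "rb") as f:
                    expected[rel] = sha256_of(f)
                tar.add(full, arcname=rel)
    # Read the archive back from disk and hash what is really in it.
    found = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                found[member.name] = sha256_of(tar.extractfile(member))
    return found == expected
```

Only an archive for which this returns `True` would then be copied to the off-site mirrors. Files that change while the archive is being written would make the check fail, so it is best run while the wiki is quiet.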

We still seem to have some broken files though. DarkHallow's details sheet doesn't render fully for instance. I have no real clue why. But I'll try to fix it. If anyone sees other bots who have broken details sheets list them here: /BrokenDetailsSheet please. -- PEZ

And now we had an outage again. I still believe the problem is worsened by huge battle log files. But it seems it can happen anyway. -- PEZ

I just had a nasty thought... random lockups + random text corruption on pages (e.g. "ramminmmintrue" on SilverSurfer/History)... Is the hard drive failing? --David Alves

I hope not. PEZ, I have lots of free space on my PowerBook, so I can keep a backup of the Robocode stuff on it if you want. Just in case... -- Jonathan

500GB at your disposal here. ;-) --David Alves

If PEZ had such a big hard drive, he would still have warranty. ;-) -- Jonathan

Thanks guys. I'll look into archiving stuff tonight (CET). I think Pulsar too agreed to keep the wiki data safe. That means we can safekeep things at three places besides the original server. Great! I don't think it's the hard drive failing. But I don't know. It would be true horror if it did. But I think a thing like rammmintrue couldn't happen as a result of that without the whole wiki totally breaking down. -- PEZ

OK. Maybe I can leave the wiki up a little while now. I had to rewind stuff a half day or so. Please help out with OperationRecovery. -- PEZ

I have forwarded PEZ's e-mail to the whole robowiki list without first looking if I could reach the wiki, with the wrong subject... but of course it's better than no wiki. :-D -- Jonathan

The robowiki.net -> robowiki.net/perl/robowiki redirection is not working, you get a "Forbidden" error. -- ABC

@Jonathan: What robowiki list? --David Alves

Look in the preferences. -- Jonathan

I have a theory about these major blackouts at RoboWiki... Probably they are attacks of the SFLA (SilverFistLibertyArmy) militia, trying to release their captive leader... -- Axe

Yes, that's probably true! I heard they are now working together with the SPDF (SavePreloadedDataFront) and those guys take no prisoners I tell ya! They are probably trying to send us a message or something... --Vic

I wish the SFLA success. I am not sold on the SPDF though. I think it is like being given the questions before you take the test. -- jim

SFLA? =)

Now the redirection from robowiki.net/ works again. -- PEZ

A little bird (fish, let's say...) told me that David is going to host RoboRumble now... Thanks, David! -- Axe

My pleasure. Pulsar, can you send me the server classes? --David Alves

Did the American understand the little joke about fish? =) -- PEZ

Pez means fish in Spanish, but I don't think either of you speak Spanish... does it mean fish in some other language too? --David Alves

Dunno, but it's not common for Americans to speak languages other than English... :) -- Axe

Eu falo um pouco de português também. :-) But I admit I had to use a dictionary to look up the spelling, I can't read or write it very well. --David Alves

Eu sei. That only proves that u aren't a common American. The exception that confirms the rule. :) -- Axe

No hablo. You've got that right David. =) -- PEZ


Last edited October 1, 2004 10:01 EST by PEZ