No servers available?

AnonDuck
TMW Adviser
Posts: 645
Joined: 02 Jan 2009, 04:19
Location: Catland

Re: No servers available?

Post by AnonDuck »

Here's why it was down then:

The login server was laggy. The mmo_auth_sync() function that writes the accounts database was taking an inordinate amount of CPU time due to the many accounts on TMW. Because the login server is a single-threaded daemon, all other processing is suspended during this operation, leading to horribly long login delays. The problem was possibly exacerbated by the web-based account creation script, which likes to repeatedly call ladmin with bad parameters, possibly leading to more calls to mmo_auth_sync() than needed.

Several solutions were tried to mitigate this problem. The login-server was calling mmo_auth_sync() whenever any change was made to the accounts DB (which is a lot), so first I removed most calls to the function and set it to be triggered on a 5 minute timer. This was not very effective, again possibly due to the ladmin issue: I had left the mmo_auth_sync() calls in the ladmin code paths, assuming anything from ladmin could be considered trusted input and should trigger an immediate DB write.
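
For anyone curious, the timer idea boils down to a "dirty flag plus interval check". This is only a minimal sketch of that pattern, not the actual login-server code: mmo_auth_sync() is a counting stub here, and mark_accounts_dirty(), maybe_sync_accounts(), sync_call_count() and SYNC_INTERVAL_SEC are names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define SYNC_INTERVAL_SEC (5 * 60)    /* the 5 minute timer from the post */

static bool   accounts_dirty = false; /* true once the in-memory DB has unsaved changes */
static time_t last_sync      = 0;     /* time of the last flush */
static int    sync_calls     = 0;     /* bookkeeping for this sketch only */

/* Stub standing in for the real (slow) mmo_auth_sync(). */
static void mmo_auth_sync(void)
{
    assert(accounts_dirty);           /* the timer path should never write an unchanged DB */
    sync_calls++;
}

/* Hypothetical helper: call this wherever the code used to call
 * mmo_auth_sync() directly after changing an account. */
void mark_accounts_dirty(void)
{
    accounts_dirty = true;
}

/* Hypothetical helper: call once per main-loop pass with the current time.
 * Writes at most once per interval, and only if something changed.
 * Returns true if a write happened. */
bool maybe_sync_accounts(time_t now)
{
    if (!accounts_dirty || now - last_sync < SYNC_INTERVAL_SEC)
        return false;
    mmo_auth_sync();
    accounts_dirty = false;
    last_sync = now;
    return true;
}

/* How many times the stub writer has run so far. */
int sync_call_count(void)
{
    return sync_calls;
}
```

The point of the pattern is that any number of account changes inside one interval collapse into a single write. Any code path that still calls the writer directly, as the ladmin paths did, bypasses that batching entirely.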

The next attempt involved removing all calls to mmo_auth_sync() except the one triggered by the timer. Jaxad had estimated that it takes around 30 seconds to dump the accounts DB to disk, so it was decided that it would be a good idea to fork(2) off the mmo_auth_sync() function to a child process. This solution had proved successful with the character server and had almost totally eliminated its lag issues (including party/whisper lag). I pushed the patch and went to sleep. While I was drooling on a pillow, Jax pushed this code live.
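
The general shape of "fork off the write" looks something like the sketch below. It is an illustration under assumptions, not the real patch: write_accounts(), sync_accounts_in_child() and reap_sync_children() are hypothetical names, and the writer just dumps a placeholder line instead of the real accounts DB.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for the slow writer: dumps one dummy line to `path`.
 * Returns 0 on success, -1 on failure. */
static int write_accounts(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (fputs("placeholder account record\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Fork a child that writes the snapshot while the parent keeps serving.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t sync_accounts_in_child(const char *path)
{
    assert(path != NULL);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: it sees a copy of the parent's memory as of the fork(),
         * so the parent can keep modifying accounts during the write. */
        int rc = write_accounts(path);
        /* Leave with _exit() so the child does not run exit-time handlers
         * or flush stdio buffers inherited from the parent. */
        _exit(rc == 0 ? 0 : 1);
    }
    return pid;
}

/* Collect finished children without blocking, so they do not pile up
 * as zombies. Meant to be called from the main loop.
 * Returns how many children were reaped. */
int reap_sync_children(void)
{
    int reaped = 0;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

The parent pays only for the fork() itself, and the 30 seconds or so of disk writing happens in the child. The price is that the child inherits everything the parent had at that moment, including open file descriptors and registered exit handlers.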

Upon waking up I found that TMW was down. Unfortunately the login server operates slightly differently from the char-server, so what appeared to be a cut-and-dried copy/paste fix went a bit awry. The char-server relies on a SIGINT handler to clean up after itself when it exits. The login-server had to be funky and uses atexit(3) to run additional code on exit. Since the code that forks off the write calls exit() when the child process is through running mmo_auth_sync(), the atexit() handler was being called in the child as well. I won't bother with a full stack dump here, but the code the atexit() handler calls eventually closes down all sockets in the process. Now if you know anything about POSIX forking semantics you would know that a forked child shares its sockets with the parent, so when the child tears one down it can end up dead for the parent too. This led to a condition where the parent login-server process was running its main select()/accept() loop on a closed socket, spewing errors to the log and chewing 100% CPU. Nice.

The solution to this issue was to replace the call to exit() with _exit(), which bypasses the atexit() handler and exits immediately. After I pushed these changes, Jax pushed them to the main repo and restarted the server, and here we are. It seems there still might be problems.
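
If you want to see the exit() versus _exit() difference for yourself, here is a small self-contained sketch. It is not the login-server code: cleanup_on_exit() is a hypothetical stand-in for the real exit-time cleanup and only reports through a pipe that it ran, and handler_runs_in_child() is a made-up helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;  /* write end of a pipe; -1 means "handler disabled" */

/* Stand-in for the exit-time cleanup. Instead of closing down sockets it
 * writes one byte so the parent can tell that it ran. */
static void cleanup_on_exit(void)
{
    if (report_fd >= 0) {
        ssize_t n = write(report_fd, "x", 1);
        (void)n;
    }
}

/* Fork a child that leaves via exit() or _exit().
 * Returns 1 if the atexit handler ran inside the child, 0 if it did not,
 * and -1 on error. */
int handler_runs_in_child(bool use_plain_exit)
{
    static bool registered = false;
    int fds[2];
    char byte;

    if (!registered) {
        int rc = atexit(cleanup_on_exit);
        assert(rc == 0);
        registered = true;
    }
    if (pipe(fds) != 0)
        return -1;

    report_fd = fds[1];
    pid_t pid = fork();
    if (pid == 0) {
        close(fds[0]);
        if (use_plain_exit)
            exit(0);   /* runs the atexit handlers inherited from the parent */
        _exit(0);      /* terminates immediately and skips those handlers */
    }

    report_fd = -1;    /* parent: keep the handler quiet at our own exit */
    close(fds[1]);
    if (pid < 0) {
        close(fds[0]);
        return -1;
    }

    /* One byte arrives if the handler ran; otherwise we get EOF (0)
     * once the child has exited and its copy of the pipe is closed. */
    ssize_t got = read(fds[0], &byte, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return got == 1 ? 1 : 0;
}
```

Calling handler_runs_in_child(true) reports that the handler ran in the child, and handler_runs_in_child(false) reports that it did not. In the real server the handler did a lot more damage than writing one byte.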

Satisfied with the explanation? It took me 10 minutes to write up. Time I could have been spending looking into this further.
Head of the TMW Illuminati
Big Crunch
TMW Adviser
Posts: 1056
Joined: 16 Dec 2009, 22:52

Re: No servers available?

Post by Big Crunch »

Thank you for taking the time to explain. I assume that this was posted as a legit explanation and not an attempt to be an exaggeratedly precise post. I would have been satisfied with 'we are having some software issues and we hope to have it taken care of in the next few hours.' I appreciate the level of detail you replied with, however. It indicates a high level of commitment to those of us who depend on you guys.


BC
sexy red bearded GM
meway
TMW Classic
Posts: 1737
Joined: 04 Jan 2009, 05:02
Location: Detroit MI

Re: No servers available?

Post by meway »

Thank you MC, but really, in the meantime, if you have nothing better to do: meway.ath.cx, GM playground. ^_^ Just for now.
thedarkfinder
Novice
Posts: 136
Joined: 21 Dec 2008, 02:18

Re: No servers available?

Post by thedarkfinder »

Mad Camel

Thank you for getting our beloved game back up.
Jaxad0127
Manasource
Posts: 4209
Joined: 01 Nov 2007, 17:35
Location: Internet

Re: No servers available?

Post by Jaxad0127 »

There were some issues with some bug fixes to the login-server. It took a few tries to get everything working right. Everything should be fine now.
Big Crunch
TMW Adviser
Posts: 1056
Joined: 16 Dec 2009, 22:52

Re: No servers available?

Post by Big Crunch »

It is fixed and working better than ever, I might add. Thanks guys.
sexy red bearded GM
meway
TMW Classic
Posts: 1737
Joined: 04 Jan 2009, 05:02
Location: Detroit MI

Re: No servers available?

Post by meway »

Big Crunch wrote: It is fixed and working better than ever i might add. Thanks guys.

Yes, the problems with lag I was having before do not seem to be as present as before. I actually receive no lag now :D