Well, no sysadmins have been found in the last two hours -- I suppose I'll try posting here to see if someone comes across it.
We're having massive db lag on enwiki, including a lag on usercontributions of several hours. At present, it appears that every server servicing enwiki is down except for ariel. Tim mentioned something about botched code from the last scap that killed samuel -- is this related to our current problems?
In any case, we need a server admin quite desperately, as enwiki is teetering on death at the moment ...
On 6/3/07, Daniel Cannon cannon.danielc@gmail.com wrote:
In any case, we need a server admin quite desperately, as enwiki is teetering on death at the moment ...
</melodrama> As an update, things do seem to be improving a bit. In any case, it's not a *critical* situation at this point. The databases are still lagging quite a bit--four hour lags on usercontributions of course the most noticeable. Server lag seems to have cleared itself up considerably. Still would be nice if a sysadmin could perchance restart the db servers or look at (and preferably fix :D) the problem. Thanks.
I have a few questions about this request,
1. What sort of prior experience is required?
2. What sort of commitment is required?
3. Is it likely that the age restriction will be waived if the need becomes great enough and there is little input from non-minors?
4. Could the age restriction be removed if a special MySQL user was created which did not give access to the sensitive tables?
Thanks.
On 04/06/07, Daniel Cannon cannon.danielc@gmail.com wrote:
On 6/3/07, Daniel Cannon cannon.danielc@gmail.com wrote:
In any case, we need a server admin quite desperately, as enwiki is teetering on death at the moment ...
</melodrama> As an update, things do seem to be improving a bit. In any case, it's not a *critical* situation at this point. The databases are still lagging quite a bit--four hour lags on usercontributions of course the most noticeable. Server lag seems to have cleared itself up considerably. Still would be nice if a sysadmin could perchance restart the db servers or look at (and preferably fix :D) the problem. Thanks.
-- Daniel Cannon (AmiDaniel)
http://amidaniel.com
cannon.danielc@gmail.com
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 04/06/07, Robert Leverington lcarsdata@googlemail.com wrote:
I have a few questions about this request,
A system administrator is expected to have experience managing Unix and Linux systems, because that's what they do.
Wikimedia system administrators also have a lot of trust from both the communities and the existing team. Trust is most important.
Rob Church
Robert Leverington wrote:
I have a few questions about this request,
1. What sort of prior experience is required?
2. What sort of commitment is required?
3. Is it likely that the age restriction will be waived if the need becomes great enough and there is little input from non-minors?
4. Could the age restriction be removed if a special MySQL user was created which did not give access to the sensitive tables?
Thanks.
I'd like to point out that this was a request for a sysadmin to assist, not an advertisement of a job position, as far as I can read into it :)
~ Paul Williams
On 6/4/07, Paul Williams paul@skenmy.com wrote:
I'd like to point out that this was a request for a sysadmin to assist, not an advertisement of a job position, as far as I can read into it :)
That's correct, there was no request for *new* sysadmins here. And if there was, Daniel certainly wouldn't be the one to post it, any more than I would. :)
Daniel Cannon wrote:
On 6/3/07, Daniel Cannon cannon.danielc@gmail.com wrote:
In any case, we need a server admin quite desperately, as enwiki is teetering on death at the moment ...
</melodrama> As an update, things do seem to be improving a bit. In any case, it's not a *critical* situation at this point. The databases are still lagging quite a bit--four hour lags on usercontributions of course the most noticeable. Server lag seems to have cleared itself up considerably. Still would be nice if a sysadmin could perchance restart the db servers or look at (and preferably fix :D) the problem. Thanks.
Yes, that was dealt with.
A bug in the new image code caused writes meant for the Commons database to be attempted on the English Wikipedia database server; the resulting errors broke replication.
Once the bug was fixed, we added a dummy Commons table on the English Wikipedia server so that the replication entries would go through without complaint. By that time replication was about four hours behind, but it had all caught up by morning.
-- brion vibber (brion @ wikimedia.org)
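The failure mode described in the message above can be sketched with a toy model. This is not MySQL replication itself, only an illustration under stated assumptions: an in-memory SQLite database stands in for the replica, and the table names (`page`, `commons_image`), the statements in the log, and the `apply_log` helper are all hypothetical. The sketch shows a replica that stops at the first statement it cannot apply, and then resumes once a dummy table exists.

```python
import sqlite3


def apply_log(replica, log, start=0):
    """Apply statements in order from `start`; stop at the first failure.

    Returns the index of the next statement still to be applied.
    """
    pos = start
    for stmt in log[start:]:
        try:
            replica.execute(stmt)
        except sqlite3.OperationalError:
            break  # replication halts here; later entries queue up behind it
        pos += 1
    replica.commit()
    return pos


# Toy replica that has a local table but no Commons table.
replica = sqlite3.connect(":memory:")
replica.execute("CREATE TABLE page (title TEXT)")

# Hypothetical replicated statements; the second one is the stray write.
log = [
    "INSERT INTO page VALUES ('A')",
    "INSERT INTO commons_image VALUES ('x.png')",
    "INSERT INTO page VALUES ('B')",
]

stalled_at = apply_log(replica, log)

# The workaround: create a dummy table so the stray entry applies cleanly.
replica.execute("CREATE TABLE commons_image (name TEXT)")
final_pos = apply_log(replica, log, stalled_at)
```

In this model the replica stalls on the second entry, and everything behind it waits, which is how a single bad statement can turn into hours of lag. After the dummy table is created, the remaining entries apply in order.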
Daniel Cannon wrote:
Well, no sysadmins have been found in the last two hours -- I suppose I'll try posting here to see if someone comes across it.
I think we should have an escalation procedure for downtime that involves calling the Wikimedia office as a first step. Out of hours, a recorded message would give a special number to call which is diverted to, say, the mobile phone of a sysadmin.
I was awake at the time, but busy with non-Internet stuff. I could have responded to a request by phone at any time. I fixed the problem as soon as I by chance turned on my monitor and saw the complaints on IRC. Brion arrived at the same time; presumably someone had called him.
Last time I gave out my mobile number to a group of active Wikipedians, a member of that group turned troll and posted that number on Wikipedia in an attempt to get back at me for blocking him. Nobody ever called, but still, I'm slightly hesitant to do it again. Having a separate diverted number instead of distributing sysadmin contact details means we can implement call screening, time of day restrictions, etc. should the need arise.
-- Tim Starling
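As a rough sketch of what call screening and time-of-day restrictions on a diverted number might look like, assuming a simple forwarding rule: every name and value below (`BLOCKED_CALLERS`, the quiet-hours window, `route_call`, the phone numbers) is hypothetical and not something proposed in the thread.

```python
from datetime import time

# Hypothetical policy values; none of these come from the thread.
BLOCKED_CALLERS = {"+10000000000"}
QUIET_START, QUIET_END = time(23, 0), time(7, 0)


def in_quiet_hours(now):
    """Return True if `now` falls inside the quiet window.

    The window wraps past midnight, so it is the union of two ranges.
    """
    return now >= QUIET_START or now < QUIET_END


def route_call(caller, now, on_call_number):
    """Return the number to forward to, or None to play a recorded message."""
    if caller in BLOCKED_CALLERS:
        return None  # call screening
    if in_quiet_hours(now):
        return None  # time-of-day restriction
    return on_call_number
```

Because callers only ever see the diverted number, the policy can be changed, or the on-call sysadmin swapped, without anyone's personal contact details being distributed.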
On 04/06/07, Tim Starling tstarling@wikimedia.org wrote:
still, I'm slightly hesitant to do it again. Having a separate diverted number instead of distributing sysadmin contact details means we can implement call screening, time of day restrictions, etc. should the need arise.
That sounds like a solid idea, given that our system administration team is so geographically dispersed.
Rob Church
wikitech-l@lists.wikimedia.org