hello,
something apparently went wrong on zedler last night, but i don't have time to piece together what it was from IRC scrollback (and no-one's let me know what happened), and MySQL replication is still broken.
could someone please tell me what happened so i can fix it (or else fix it themselves)?
if multiple people will perform root tasks on zedler, i think they should at the minimum be logged either to this mailing list or the Wikimedia server admin log (https://wikitech.leuksman.com/view/Server_admin_log).
(having said that, it's nice to know that i don't have to fix everything myself now :-)
k.
Hi Kate
Short version:
Some cron job run by apper filled the disk. It produced about 240GB of data in two files in /tmp - I guess that was also the RAM problem the other day, since /tmp was in RAM, right?
Anyway, datura killed the processes, deleted the files and commented out the respective jobs in the crontab. MySQL came back on its own, afaik, but no one knew how to restart replication.
And yes, an admin log is a good idea ;) Oh, and thanks for the great job you are doing for us!
Regards, Daniel
Hi,
Daniel Kinzler daniel@brightbyte.de wrote on Wed, 25 Jan 2006 13:06:34 +0100:
Some cron job run by apper filled the disk. It produced about 240GB of data in two files in /tmp - I guess that was also the RAM problem the other day, since /tmp was in RAM, right?
Sorry about that. Datura told me about it yesterday. I fixed a bug in the script that was causing a problem (I thought it was the problem) and tested it this morning by moving the cron jobs to the morning and watching the running processes, and everything worked fine. Why it doesn't work in the evening I don't really know... The script uploads a file to Commons, so maybe the problem occurs when the upload times out because the server is overloaded. I'll have a closer look at it when I have more time; until then I have deactivated all cron jobs...
So sorry again :/
Sincerely Christian Thiele
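Editorial sketch, not part of the original thread: one way a script like this could protect itself is to cap how much it is willing to write to a temporary file. The thread does not say what apper's script actually does, so the function name write_capped, the OutputTooLarge exception and the byte budget below are all hypothetical.

```python
import os


class OutputTooLarge(RuntimeError):
    """Raised when a job tries to write more than its byte budget (hypothetical name)."""


def write_capped(chunks, dest_path, max_bytes):
    """Write an iterable of bytes chunks to dest_path, refusing to exceed max_bytes.

    If the budget would be exceeded, the partial file is deleted and
    OutputTooLarge is raised, so a runaway job cannot fill the disk.
    """
    written = 0
    try:
        with open(dest_path, "wb") as out:
            for chunk in chunks:
                # Check before writing so the file never grows past the budget.
                if written + len(chunk) > max_bytes:
                    raise OutputTooLarge(
                        f"refusing to write more than {max_bytes} bytes to {dest_path}"
                    )
                out.write(chunk)
                written += len(chunk)
    except OutputTooLarge:
        # The file is already closed here; remove the partial output.
        os.remove(dest_path)
        raise
    return written
```

A cron job would wrap whatever produces its output in such a call and pick a budget far below the size of the disk; a job that blows the budget then fails loudly instead of leaving hundreds of gigabytes in /tmp.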
It's probably a good idea to move /tmp off the root filesystem to prevent something like this from happening; normal users shouldn't be able to leave the system inoperable by creating a large file.
Could per-user process and space limitations be put in place instead, or along with /tmp/ preventative measures? I know per-user restrictions are less flexible, but if one of my scripts goes haywire, I'm far happier if it just hoses my home directory and keeps me from creating new processes, and doesn't do that to any other users.
-Dave
On Wed, Jan 25, 2006 at 04:04:47PM +0000, Ævar Arnfjörð Bjarmason wrote:
It's probably a good idea to move /tmp off the root filesystem to prevent something like this from happening; normal users shouldn't be able to leave the system inoperable by creating a large file.
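Editorial sketch, not part of the original thread: the per-user limits Dave asks about are normally set by the administrator (on Linux, typically through PAM's limits.conf and filesystem quotas), but a script can also restrain itself with the same kernel resource limits. The example below assumes a Unix system with Python's standard resource module; the helper names and the default values are hypothetical, and RLIMIT_NPROC is not available on every platform.

```python
import resource


def cap_soft_limit(which, requested):
    """Set the soft limit for resource `which`, never exceeding the hard limit.

    Returns the soft limit that was actually applied.
    """
    _soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process may not raise its soft limit above the hard limit.
        requested = min(requested, hard)
    resource.setrlimit(which, (requested, hard))
    return requested


def limit_runaway_script(max_file_bytes=1 << 30, max_procs=50):
    """Apply illustrative caps to the current process and its future children.

    RLIMIT_FSIZE bounds the size of any single file the process writes.
    RLIMIT_NPROC bounds how many processes the user may have when this
    process tries to create a new one.
    """
    cap_soft_limit(resource.RLIMIT_FSIZE, max_file_bytes)
    cap_soft_limit(resource.RLIMIT_NPROC, max_procs)
```

Calling limit_runaway_script() at the top of a cron job means a write past the file-size cap fails instead of continuing to fill the disk. These limits are inherited by child processes, but they only bind processes that opt in, so they complement rather than replace administrator-enforced quotas and a separate /tmp filesystem.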
Toolserver-l mailing list Toolserver-l@Wikipedia.org http://mail.wikipedia.org/mailman/listinfo/toolserver-l