-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
on Monday, 6 April, between 05:00 and 07:00 UTC, scheduled quarterly maintenance will take
place on the following servers:
zedler (s2), rosemary (s1), yarrow (s3), hemlock, amaranth, willow (stable)
expected downtime for each server is less than 15 minutes.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAknQ3OMACgkQIXd7fCuc5vIDhgCfWY1P+D+/wGdyv6BfLEfpVn0n
8XQAnRYKQkYETmEvn4D5UvCabNGjfY1n
=Cfx1
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
at ~07:00, the web server (wolfsbane) became unresponsive and was rebooted.
during this time http://toolserver.org was unavailable. the problem was
resolved about 10 minutes later.
the login server and the stable server were unaffected.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAknPEl8ACgkQIXd7fCuc5vI1NACfXrnwaEF0+9yCQkA6KInmWF2v
aE0AoLukCBsxuDRgRQldkXL0SfxnzBDn
=xCMm
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
at ~15:00, an unplanned outage on the NFS server caused downtime for several
minutes, and a slightly longer outage of the s2 cluster while the database
recovered. the server was rebooted and now seems to be working fine. the
stable server was unaffected.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAknOQtQACgkQIXd7fCuc5vJ9BwCeLGjGbk4R/bzTusBMszY+Ig4u
E90An3G62ko0Mgvsoy8bdytgAyF/Fpab
=jme/
-----END PGP SIGNATURE-----
Hello all,
the load on the login server (nightshade) is quite high at the moment. One of
the causes is that people don't use nice. Put simply, the nice level tells the
operating system how important a program is and how much CPU time it should
get.
Normal users can set a nice level between 0 and 19. 0 is the highest priority,
19 the lowest. The default is 0. Long-running tasks like bots should run at a
lower priority, such as nice level 10.
Usage is quite easy: just put "nice -n 10" in front of your command. An
example would be
nice -n 10 python redirect.py broken
which runs python (which in turn executes the redirect script) at nice
level 10.
For more details, see the man page for nice (man nice).
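If a bot is written in Python, it can also lower its own priority at startup, so the setting is not lost when someone forgets the "nice -n 10" prefix. The following is only a minimal sketch under that assumption; the helper name ensure_niceness is made up for illustration, and os.nice is available on Unix systems only.

```python
import os


def ensure_niceness(target: int = 10) -> int:
    """Raise this process's nice value to at least `target` and return it.

    Hypothetical helper for illustration. It never lowers the nice value,
    because an unprivileged user is not allowed to do that anyway.
    """
    # os.nice() takes an increment, not an absolute value; an increment
    # of 0 simply reports the current nice value.
    current = os.nice(0)
    if current < target:
        current = os.nice(target - current)
    return current


if __name__ == "__main__":
    print("running at nice level", ensure_niceness(10))
```

Calling the helper once at the top of a long-running script should have roughly the same effect as starting the script with "nice -n 10".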
Because I know that a few of you (of course only new users ;)) will forget
that, I wrote a little program today. The program searches for long-running
programs with high CPU time at nice level 0 and forcibly sets them to nice
level 19 (then it sends an email to inform the user). At the moment, the
program runs in test mode; that means it only sets a nice level of 1 and sends
an email.
If you receive such an email and think that it is wrong, please message me.
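To make the selection rule described above concrete, here is a rough sketch of how such a check could be expressed. It is not the actual program: the Proc record, the function names, and the 600-second CPU threshold are all assumptions made for illustration, and the real criteria may differ.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    """Hypothetical snapshot of one process, e.g. parsed from ps output."""
    pid: int
    user: str
    nice: int
    cpu_seconds: float


def pick_candidates(procs, min_cpu_seconds=600.0):
    """Return processes at nice level 0 that have used a lot of CPU time.

    The 600-second threshold is an assumed value, not the real one.
    """
    return [p for p in procs
            if p.nice == 0 and p.cpu_seconds >= min_cpu_seconds]


def planned_nice(test_mode=True):
    """Nice level to apply: 1 in test mode, 19 otherwise (as in the mail)."""
    return 1 if test_mode else 19


if __name__ == "__main__":
    sample = [
        Proc(101, "alice", 0, 3600.0),   # long-running, not niced
        Proc(102, "bob", 10, 7200.0),    # already niced, left alone
        Proc(103, "carol", 0, 2.5),      # short-lived, left alone
    ]
    for p in pick_candidates(sample):
        print(f"would renice pid {p.pid} ({p.user}) to {planned_nice()}")
```

Under these assumptions, only a process that is both at nice level 0 and above the CPU threshold is touched; anything already started with nice is left alone.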
Thanks for your attention :).
Sincerely,
DaB.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
i've deployed a new version of MySQL on rosemary (s1) and yarrow (s3) which
contains a fix for the bug where long SELECT queries on wiki databases break
replication. if you were affected by this problem, you can re-enable tools
which were disabled because of it, but only *after* you test that your queries
no longer cause replag.
(if you re-enable your tools without testing first, i will be annoyed.)
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAknKbpQACgkQIXd7fCuc5vInaQCghOCkrnvaV4oDnvM209bM68hM
7JYAn1JUTDaaP2Dp5TMRLWGHHFo3JK3M
=LDHZ
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
we're now (again) looking for new admins. i originally posted this about a
year ago, but we ended up not being able to add new admins then, so nothing
happened. if you applied then, please feel free to re-apply.
we are looking for someone with experience using the following in a production
environment:
+ General Unix to an expert level (say, 7-10 years experience at least)
+ Solaris
+ general administration
+ patch management
+ live upgrade
+ ZFS
+ jumpstart
+ MySQL
+ Debian Linux
people without this experience may be considered, if you're willing to put in
the effort to get up to speed quickly.
knowledge of the following would be helpful:
+ C and/or C++ programming language
+ PHP
+ Veritas
+ Sun Java System Web Server
+ Zeus Web Server
+ Apache
+ LDAP (Sun DSEE)
some tasks the new admin might start with:
+ designing and implementing a monitoring system to provide current and
historical statistics about the cluster, warning about problems, etc.
+ implementing a network install (jumpstart) framework to enable
(re)installation of hosts easily, with proper post-install setup.
it would be particularly nice to have another 'on-call' admin besides myself,
for when there are urgent problems with the cluster and no one is around.
you should either be a well-known and trusted member of the community, or else
have some kind of professional background (either academic or commercial). we
might require references for the latter, depending on the situation. you will
be required to provide the Wikimedia office with proof of your identity.
if you're interested, please mail me privately (not the list), including some
information about yourself and details of your relevant background/experience,
and an estimate of how much time you might be able to donate. you may include
a CV if you wish.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAkm5oRoACgkQIXd7fCuc5vK7LQCgg9KjbZ5Cajfv3S/Sert2pjIf
zjIAnRe9qm31QU61G+x2cuZVwHnZ8DwZ
=Qyhz
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
we are now backing up user databases (u_%) on all three database servers. this
means the data will be restored if the server fails, or we move to another
server. additionally, the backups are now copied to an off-site server, making
data loss less likely.
please take a moment to look at your user databases, and drop any that you no
longer need; these databases use a lot of space both on the SQL servers, and on
the backup server.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAkmxCKUACgkQIXd7fCuc5vLAzgCgitEFyiyVRskxx6EWqP+7fsMT
88QAn3HOGgk7824tcuulDaRSQYzX2ZLJ
=rQwx
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
wolfsbane and nightshade were both rebooted this morning. i downgraded
nightshade's kernel to the current Debian kernel, to see if this fixes the
issue we've been having.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAkmvZ/kACgkQIXd7fCuc5vLI3QCgs4LsXkl7XafMaLcpnMryD+UG
brYAoJDglv0oaAIMxfIMB8H0i6KfEikj
=xs7o
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
sql-s1 is now pointing at rosemary, and the web server has been moved to
wolfsbane. please report problems with either server by opening a request in
the TS project in JIRA.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)
iEYEARECAAYFAkmr9vQACgkQIXd7fCuc5vLhNgCfVFUysPjaRgNAKbKvdm75bzFJ
73QAn3W0qA6iDoz650XtVYsgRUpF3Q7K
=p5fZ
-----END PGP SIGNATURE-----