Hello all,
it is finally done: the Linux login servers (yarrow and nightshade) are ready
to use :-). You should be able to log in to these boxes from the outside, and
to hop over from any other Toolserver host as well.
I installed most of the packages listed at [1]; if any package is missing,
please use JIRA to request it. At the moment we are not able to package
software for which there is no Debian package, because we lack a build host (I
will speak with WMDE about that).
We run Debian stable on the Linux boxes, and both servers (like willow) are
managed by Puppet to keep their configuration and software in sync. I
installed some aliases (mostly gsomething->something) to make the switch from
Solaris (back) to Linux easier for you – if I missed an important one, please
talk to me on IRC or open a JIRA ticket; less important ones you can define
yourself in .bashrc or .profile in your home directory.
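As a sketch, such compatibility aliases might look like this in ~/.bashrc (the exact set installed on the servers is not listed in the mail; these names are illustrative examples of the gsomething->something pattern):

```shell
# Illustrative ~/.bashrc aliases for the Solaris -> Linux switch.
# On Solaris the GNU versions of the core tools carry a "g" prefix
# (gsed, gawk, ...); on Linux they are the defaults, so the old names
# can simply point at the new ones.
alias gsed='sed'
alias gawk='awk'
alias ggrep='grep'
alias gmake='make'
```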
While you can (at least for now) use the cron system on Linux, you should not
– use the submit hosts and SGE instead! Speaking of SGE: to let your SGE jobs
run on Linux only, you have to add "-l arch=lx" as a parameter. By adding
"-l arch='*' " you let SGE pick the best server across both operating systems
for your job. This will always result in the minimum waiting time and is the
recommended usage where possible. Solaris will remain the default for SGE
until 30 September 2012; after that the default will be switched to any
available OS (I will send a reminder or two before then).
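For illustration, the two variants might look like this on a submit host (the job script name and the runtime limit are made-up placeholders, not part of the original mail):

```shell
# Linux-only: request the lx architecture explicitly.
qcronsub -l arch=lx -l h_rt=1:0:0 $HOME/bin/myjob.sh

# Either OS: let SGE pick whichever host is free first (recommended).
qcronsub -l 'arch=*' -l h_rt=1:0:0 $HOME/bin/myjob.sh
```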
If there are any problems, please open a JIRA report and the roots will try to
resolve them. What will not happen: installing packages from testing or
unstable, or installing non-packaged software (see above).
The next step in my plan is to convert our webservers to Debian (using Apache
instead of ZWS). This step will start on Monday or Tuesday with the removal of
wolfsbane from the web cluster (so ortelius will be the only webserver for
some time). I doubt that this step will take as long as the previous one
(because most of the Puppet work is already done), but you never know.
I would hereby like to thank Merlissimo; without him the conversion would have
been much harder, or impossible. Another thank-you to Nosy, who freed me from
Solaris work.
Sincerely,
DaB.
[1] https://wiki.toolserver.org/view/User:Dab/Debian-Packages
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hello all,
there was a mail-sending problem[1] on the Linux boxes. The problem is fixed
now and the mails are on their way. Just to let you know, in case you wonder
where all these old mails are coming from (I doubt the issue was serious;
otherwise someone would have notified me about the missing mails before).
Sincerely,
DaB.
[1] https://jira.toolserver.org/browse/MNT-1256
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hello all!
I have a bunch of cronie jobs calling qcronsub several times with very
similar settings (only the language of the wiki changes). In total there are
5 jobs – regularly (about every 2nd day) one of these jobs does not get
executed, and it is always the same one. I do not get any error mail. Any
idea?
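One property of qcronsub worth checking here (stated from memory of the Toolserver documentation, so treat it as an assumption): it refuses to submit a job if a job with the same name is already queued or running, and SGE's default job name is the script's basename. Five calls to the same script therefore share one name unless each is given its own with -N, e.g. (paths and wiki codes are made up):

```shell
# Crontab sketch: give each per-wiki job a distinct SGE name so
# qcronsub's "already queued" check does not silently skip one of them.
0 3 * * * qcronsub -N bot-dewiki -l h_rt=2:0:0 $HOME/bot/run.sh dewiki
5 3 * * * qcronsub -N bot-enwiki -l h_rt=2:0:0 $HOME/bot/run.sh enwiki
```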
Thanks a lot and greetings
DrTrigon
Hello everybody!
I am seeing behaviour with SGE that I cannot understand. To log my scripts'
output I redirect stdout and stderr. When running a script from the console
(e.g. on willow) all writes to stderr are passed through as they are, but when
run through qcronsub (SGE) every write gets split up at newlines '\n', as if
there were some kind of auto-flush in the background. Changing the use of the
'j' parameter in SGE did not help. I am using Python. Is there something in
qcronsub that influences this? Any idea?
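A plausible explanation (an assumption, not something stated in the SGE docs quoted here): on a console stderr is typically unbuffered or line-buffered, while under SGE the streams are redirected to files and may be buffered differently, which changes how writes are grouped on disk. Explicitly flushing after each logical record, as in this sketch, makes the output order independent of how the job was started (`log` is a made-up helper name):

```python
import sys

def log(msg, stream=None):
    """Write one log record and flush immediately, so the record
    lands in the redirected file at once instead of whenever the
    buffer happens to fill up."""
    if stream is None:
        stream = sys.stderr
    stream.write(msg + "\n")
    stream.flush()

log("step 1 done")
log("step 2 done")
```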
Thanks a lot and greetings!
DrTrigon
As most of us already know, replag on enwiki has been going up and up
since about 30 June. As it says on status.toolserver.org, "Hight replag
because of inserting of many SHA1-hashes." (Note to DaB.: the first
word should be spelled "High".)
I asked DaB. on IRC how long this might go on, and he replied one to two
weeks. However, I've since done some independent investigation that
suggests that his estimate might be a little low.
It turns out that there are three large blocks of consecutive entries in
the revision database that need to be populated with SHA1 hashes.
Apparently there are three processes running in parallel on the WMF
servers that are filling in each of these blocks from the bottom, by
numerical order of rev_id. Knowing this, we can estimate how many
revisions still need to be populated at any given point; and, taking
such estimates at various points in time, we can estimate how long the
process will take. (Needless to say, this is only an estimate, since the
rate at which database changes are processed on the toolserver side is
variable; also, the blocks of rev_ids are not actually consecutive due
to deletions, but we can assume for our purposes that the deleted
revisions are distributed uniformly throughout the database.)
It further turns out that it is only possible to compute this estimate
for sql-s1-user (thyme), because the enwiki_p view on sql-s1-rr
(rosemary) does not have the rev_sha1 field at all (!). It appears that
the server on rosemary is receiving millions of database updates each
day from WMF and throwing them in the bit bucket.
Anyway, based on four observations spaced at 6 hour intervals, it
appears that thyme is populating about 353,000 revisions per hour, or
8.5 million per day. A simple trendline analysis shows that, at this
rate, completing the 230,000,000 remaining unpopulated revisions will
take about 27 more days (estimated completion Aug 6 at 17:48 UTC).
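The arithmetic behind the 27-day figure checks out directly (numbers taken from the post itself):

```python
# Rough completion estimate from the observed fill rate.
remaining_revisions = 230000000   # unpopulated revisions (from the post)
rate_per_hour = 353000            # revisions populated per hour on thyme

hours_left = remaining_revisions / rate_per_hour
days_left = hours_left / 24       # about 27 days, matching the trendline
```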
Anyone who relies on use of the enwiki_p database should expect a
prolonged continuation of degraded service and steadily increasing
replag.
--
Russell Blau
russblau(a)imapmail.org
Hi,
I am trying to login to toolserver.org to debug some web scripts.
However, it is refusing my SSH connection: after authentication, the
server closes the connection. Is this a known problem, or is it no
longer possible to log in on the webserver?
Bryan
Hello all,
the resources on the TS are free, but limited; so we all have to use them
fairly. Some limits (like memory usage) are set and enforced by the system,
but others are not, and it is the responsibility of every single user to make
sure not to mis- or overuse resources.
So it is, for example, NOT a good idea to run 200 processes in parallel to get
more CPU resources than you would normally get. And it is not a good idea to
use an amount of memory just below the slayer-daemon limit without any
purpose.
Sincerely,
DaB.
P.S.: It is entirely within the rules to disable a user account because of
resource misuse.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hello,
today I was appointed admin of this mailing list by the WMF, because it was
orphaned (thanks to Thehelpfulone for organizing that). I changed the mailing-
list name as suggested, removed River (thanks for your time and work) as list
admin and added Nosy as list admin.
Just to let you know.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hi all,
WMUK is currently advertising for a full-time Developer position. The job description, and info on how to apply, is at:
http://uk.wikimedia.org/wiki/Developer_job_description
My apologies for this post being slightly off-topic, but I'm hoping that this position might be of interest to some subscribers. If you know of anyone who might be interested in this position and isn't on this list, please pass the link on to them!
Thanks,
Mike Peel
Wikimedia UK