We've been receiving messages from this domain at unblock(a)toolserver.org
and they appear to be related to this:
Viral advertising for some film. In reality, it's a message with a
crapload of images attached serving no purpose for us.
Can we just block this whole domain from sending mail to toolserver
accounts? It's a nuisance, and the messages are quite large.
If you correspond with me on a regular basis, please read this document:
PGP fingerprint: 2B7A B280 8B12 21CC 260A DF65 6FCE 505A CF83 38F5
This document should be read only by those persons to whom it is
addressed. If you have received this message it was obviously addressed
to you and therefore you can read it.
Additionally, by sending an email to ANY of my addresses or to ANY
mailing lists to which I am subscribed, whether intentionally or
accidentally, you are agreeing that I am "the intended recipient," and
that I may do whatever I wish with the contents of any message received
from you, unless a pre-existing agreement prohibits me from so doing.
This overrides any disclaimer or statement of confidentiality that may
be included on your message.
The Linux servers have now been online for nearly 2 weeks. While some short- and
medium-running SGE tasks are already running there, the number of long-running
tasks is near zero (see graphs at ); in contrast, the load on willow is still
It would be nice if more of you could try to move tasks away from the Solaris
boxes to the Linux boxes (or better: make the task so independent it runs on
Please note that in ~2 months -arch=* will replace -arch=sol as the default, so
you should start checking whether your tools run on Linux (and, if not, how
that can be fixed).
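One way to make a Python tool independent of the box it lands on is to look up platform-specific settings at start-up instead of hard-coding them. The following is only a minimal sketch: the `PLATFORM_SETTINGS` table, the `settings_for` helper and the directory values are hypothetical placeholders, not actual Toolserver configuration. It relies on `platform.system()`, which reports "Linux" on Linux and "SunOS" on Solaris.

```python
import platform

# Hypothetical per-platform settings; the values are placeholders for
# illustration only, not real Toolserver paths.
PLATFORM_SETTINGS = {
    "Linux": {"tmp_dir": "/tmp"},
    "SunOS": {"tmp_dir": "/var/tmp"},  # Solaris reports itself as "SunOS"
}


def settings_for(system_name=None):
    """Return the settings for the given platform name.

    If no name is given, the current platform is detected with
    platform.system(). Unknown platforms raise RuntimeError so that a
    task fails loudly instead of running with wrong assumptions.
    """
    name = system_name or platform.system()
    try:
        return PLATFORM_SETTINGS[name]
    except KeyError:
        raise RuntimeError(f"unsupported platform: {name}") from None
```

A tool written this way would call `settings_for()` once at start-up and use the result everywhere, so the same script can be submitted to either kind of box.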
A word to the pywikipedia framework users: at the moment it is unclear whether the
old Python Unicode bug is fixed on our installation (see ). Testing
(and commenting) is very welcome, but do not run a bot unsupervised.
P.S.: If you are still not using SGE, consider it!
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
I'm getting a 502 Bad Gateway error when trying to access any of the
toolserver websites, except for the wiki which appears to simply time
out. status.toolserver.org says it's up; could someone look into this?
To speed up the reduction of the s1 (English Wikipedia) replag on thyme,
we will move the user databases from thyme to rosemary on Wednesday. During the
move the user databases will be read-only. The move will start on
Wednesday, 12:00 UTC
and will take an unknown amount of time to finish. We will message you when s1 is
If you have any questions, please send them to the mailing list.
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
(TL;DR? Skip down three paragraphs to the possible workaround....) Last
month, I reported on the progress of SHA-1 updates from the WMF servers,
and noted that s1 replag was likely to continue to be a problem for a
number of weeks. As I said then, the WMF was using (at least) three
processes to populate the SHA-1 field on three separate blocks of
revision records. All these changes then were being replicated to the
Toolserver's copies of the databases, and this flood of updates was
causing the replag.
The three blocks were being populated at different rates (for reasons
that are beyond my knowledge). On July 23 at about 15:00 UTC, rosemary
(sql-s1-rr) completed updating the first of the three blocks. The other
blocks continued to be populated (and at some point the WMF started
another process to help finish off the slowest block), but the rate of
updates was somewhat less, and rosemary actually caught up on its
backlog and reached zero replag within about a day after this milestone.
The situation on thyme (sql-s1-user) is less favorable, as we all know.
The replag on that server got much higher to start with, and thyme
didn't even reach the end of the first block until Sunday August 5 at
about 12:00 UTC. Unlike the situation with rosemary, the reduced load
after this event did not make any noticeable difference to the replag,
which has continued to increase for the past three days at much the same
rate as before. The next milestone will be completion of the second
major block, which looks like it will occur either late on Friday August
9 or early on Saturday August 10 UTC, barring any other major problems
(like the WMF server outage on Monday which caused replication at the TS
end to stop for several hours). At that point, the load from SHA-1
updates should be roughly 30% of what it had been during July. One
would think that would allow the replag to drop, but since the events of
this week, I can't be confident of that.
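To illustrate why a drop to 30% of the SHA-1 load does not by itself guarantee recovery, here is a rough back-of-the-envelope sketch. It assumes a very simple model in which the server applies replicated writes at a fixed daily capacity; every number in it is made up for illustration and is not a measurement from thyme or rosemary.

```python
from typing import Optional


def days_to_clear_backlog(backlog: float,
                          incoming_per_day: float,
                          capacity_per_day: float) -> Optional[float]:
    """Days until the backlog of unapplied writes is gone.

    Returns None when incoming writes meet or exceed capacity, because
    in that case the backlog never shrinks under this model.
    """
    net_drain = capacity_per_day - incoming_per_day
    if net_drain <= 0:
        return None
    return backlog / net_drain


# Hypothetical figures in arbitrary "units of writes"; none are measured.
OTHER_WRITES = 40.0       # normal replicated edits per day
SHA1_WRITES_JULY = 100.0  # SHA-1 update load per day during July
CAPACITY = 90.0           # what the server can apply per day
BACKLOG = 700.0           # unapplied writes already queued

july = days_to_clear_backlog(
    BACKLOG, OTHER_WRITES + SHA1_WRITES_JULY, CAPACITY)
reduced = days_to_clear_backlog(
    BACKLOG, OTHER_WRITES + 0.3 * SHA1_WRITES_JULY, CAPACITY)

print("July load:", july)
print("30% SHA-1 load:", reduced)
```

With these invented figures the backlog only starts to shrink once the total incoming load falls below capacity, and even then it takes weeks to clear. If user queries eat into the capacity, the reduced load may still not be low enough, which fits what has been seen on thyme so far.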
There is a possible workaround. The TS could treat this like a server
outage: copy the user databases from thyme to rosemary and then point
sql-s1-user to rosemary, which currently has no replag. Rosemary would
then have to handle twice the load, but thyme should start to recover
very quickly with no user-generated queries hitting it. Once thyme has
recovered, point sql-s1-rr to it.
Downsides: (1) this would require several hours of downtime for
sql-s1-user while the user databases are copied; all tools that require
access to user databases would be offline entirely for this period. (2)
it would have to wait until our volunteer TS admins have time to do it.
(3) the added load on rosemary could cause replag to grow there,
although I doubt it would come anywhere near the 14+ days replag we are
dealing with now on thyme. (4) this could all be unnecessary since thyme
might recover on its own once the SHA-1 update load is reduced, although
I don't know any way of forecasting that and experience so far has not
Question for those of you who operate and/or use tools that access s1
(enwiki): would you be willing to accept several hours of service
outage and the other downsides in exchange for getting rid of the 14-day
Maybe I missed a note about this, but is cron running on nightshade? If so, it is not sending mail.
I moved some jobs from willow to nightshade yesterday and got no email confirmation that they ran; it seems they did not run at all.