Hello all,
because of a WMF master change last week (when Nosy and I were in the
datacenter) and a mistake on our side, the data on s1 that was inserted
after 18 January is probably defective or wrong.
I have already requested a new dump and will inform you when there is any
progress or news.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hi everyone,
Nightshade was a bit slow, so I typed "top -c". I was amazed to see that
almost all the top processes seem to be interwiki-related
(interwiki.py). The same seems to be the case on willow. Normally I
wouldn't really care (we have the servers, so we should use them), but
now the login servers seem to be overloaded. Aren't there a few too many
interwiki bots?
Maarten
Hi,
I'm getting that error when querying the English Wikipedia. I add /* SLOW_OK */
to my queries. What is the point of adding SLOW_OK if my queries are being
killed anyway?
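For reference, this is roughly how I tag them; a minimal sketch, assuming
MySQLdb as the driver (the replica host name is illustrative):

    import os
    import MySQLdb

    # Connect to the enwiki replica; credentials come from ~/.my.cnf.
    conn = MySQLdb.connect(
        host='enwiki-p.rrdb.toolserver.org',
        db='enwiki_p',
        read_default_file=os.path.expanduser('~/.my.cnf'),
    )
    cur = conn.cursor()
    # The /* SLOW_OK */ comment marks the query as intentionally
    # long-running, so the watchdog should not kill it.
    cur.execute("""/* SLOW_OK */
        SELECT COUNT(*) FROM revision""")
    print cur.fetchone()[0]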
Regards,
emijrp
Hello!
I need a list of all active Wikipedia bots in all language versions
in order to judge how important the edits are that I read from the
`revision' table in databases like `dewiki_p'. I want to distinguish
bot edits from other minor edits by checking whether the corresponding
user name is that of a bot.
I could, of course, extract the names of all bots from HTML by parsing
the page about `All Wikipedia bots'
http://en.wikipedia.org/wiki/Category:All_Wikipedia_bots
But very likely there is a much more convenient way, like an SQL table
containing the bots (see the sketch below).
(The list of bots running on the Wikimedia Toolserver
http://en.wikipedia.org/wiki/Category:Wikipedia_bots_running_on_the_Wikimed…
contains too few bots for my purposes.)
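Ideally something like this would work; a sketch, assuming the standard
MediaWiki `user_groups' table is available on the replicas (ug_group =
'bot' marks locally flagged bot accounts; the host name is illustrative):

    import os
    import MySQLdb

    conn = MySQLdb.connect(
        host='dewiki-p.rrdb.toolserver.org',
        db='dewiki_p',
        read_default_file=os.path.expanduser('~/.my.cnf'),
    )
    cur = conn.cursor()
    # user_groups holds one row per (user, group) pair; the bot flag
    # corresponds to the group name 'bot'.
    cur.execute("""
        SELECT user_name
        FROM user
        JOIN user_groups ON ug_user = user_id
        WHERE ug_group = 'bot'
    """)
    bots = set(name for (name,) in cur.fetchall())

Such a query would only find bots flagged locally on that wiki, so I
would have to run it once per database.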
Best
Philipp
Hello all,
Nosy and I will be in Haarlem (NL) this week (Thursday and Friday) to perform
hardware maintenance on the TS; several services or servers will be down
during the maintenance, and there is no schedule of what will be down when.
We will try to be on IRC and announce things there, but I cannot promise that.
We hope that the TS will be completely available during the night hours (23:00
to 06:00 UTC), but it can happen that some parts will be missing.
Our goal is to expand the disk capacity of the TS and add some redundancy.
We will also repair some things and do a lot of documentation work.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
For the pywikipedia-l listeners just tuning in: the toolserver has an
overload of interwiki bots, and we want to reduce this. As such, we want to
switch to a single bot that runs all the interwiki updates from the
toolserver.
On 16 January 2012 09:19, Merlijn van Deen <valhallasw(a)arctus.nl> wrote:
> The only reasonable action we can take to reduce the memory
> consumption is to let the OS do its job in freeing memory: using one
> process to track pages that have to be corrected (using the database,
> if possible), and one process to do the actual fixing (interwiki.py).
> This should be reasonably easy to implement (i.e. use a pywikibot page
> generator to generate a list of pages, use a database layer to track
> interlanguage links and popen('interwiki.py <page>') if this is a
> fixable situation)
>
>
I took some time yesterday to work out some details on this - see
http://piratepad.net/T29Uj4j1U4 . It boils down to this:
1) generation of a list of pages to work on: from the database, if possible
2) dispatching interwiki.py based on that list of pages and handling logging
3) interwiki.py itself
My suggestion is to split these tasks, creating a simple interface
(e.g. WSGI) between 1) and 2), and using subprocesses between 2) and 3).
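To make the interface between 1) and 2) concrete, here is a minimal
sketch of what I mean by WSGI between them; the endpoint shape and the
queue contents are assumptions, not a fixed design:

    import Queue
    from wsgiref.simple_server import make_server

    pages = Queue.Queue()    # to be filled by the database-driven generator, 1)
    pages.put('en:Example')  # placeholder item

    def application(environ, start_response):
        # The dispatcher, 2), polls this endpoint and receives one page
        # title per request; 204 signals an empty queue.
        try:
            body = pages.get_nowait()
            status = '200 OK'
        except Queue.Empty:
            body, status = '', '204 No Content'
        start_response(status, [('Content-Type', 'text/plain')])
        return [body]

    make_server('localhost', 8765, application).serve_forever()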
Yesterday I worked (mainly) on speeding up the startup of
interwiki.py, so that we can spawn one process per Page.
On the Toolserver side, I would appreciate any comments/work/existing work
on the creation of an interwiki graph from the database. There are already
tools that suggest images based on interwiki links, so this code should be
around and hopefully adaptable. The only goal for this process would
be to create a list of starting pages interwiki.py can use, i.e. graphs
with one or more missing links, but without any double links.
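To illustrate the missing-link test I have in mind; a rough sketch,
assuming the standard `langlinks' table (ll_from, ll_lang, ll_title),
a simplified language-to-database mapping, and that the databases
involved live on the same server, so cross-database queries work:

    def langlinks(cur, dbname, title):
        # Outgoing interlanguage links of a main-namespace page, as a
        # {language: title} dict. `title' uses underscores (page_title
        # form); ll_title is stored with spaces.
        cur.execute("""
            SELECT ll_lang, ll_title
            FROM %s.langlinks
            JOIN %s.page ON ll_from = page_id
            WHERE page_namespace = 0 AND page_title = %%s
        """ % (dbname, dbname), (title,))
        return dict(cur.fetchall())

    def is_starting_page(cur, lang, title):
        # A page qualifies if some page it links to does not link back
        # (a missing link); double links would need a separate check.
        for other_lang, other_title in langlinks(cur, lang + 'wiki_p',
                                                 title).items():
            back = langlinks(cur, other_lang + 'wiki_p',
                             other_title.replace(' ', '_'))
            if back.get(lang, '').replace(' ', '_') != title:
                return True
        return False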
On the Pywikipedia side, some thoughts on running interwiki.py in a new
process would be welcome, e.g. how we can improve startup time ('kill all
the regexps!') and effectively spawn multiple processes. What
parameters (throttles?) should be tuned, et cetera.
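As a strawman for the spawning side, assuming the page list produced by
1) above (the -page invocation is standard pywikipedia; the pool size is
a guess):

    import subprocess
    from multiprocessing.dummy import Pool  # thread pool driving child processes

    def fix(title):
        # One short-lived interwiki.py run per page.
        return subprocess.call(['python', 'interwiki.py', '-page:%s' % title])

    pool = Pool(4)  # at most four interwiki.py processes at once
    results = pool.map(fix, page_titles)  # page_titles comes from 1)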
Best,
Merlijn
https://blog.wikimedia.org/2012/01/16/wikipedias-community-calls-for-anti-s…
Editing pages on English Wikipedia via the web service API will be
disabled for 24 hours beginning at 05:00 UTC on Wednesday, January 18,
as part of the anti-SOPA/PIPA blackout.
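For bots that run unattended, a simple guard like the following might
help; a sketch that uses only the window given above:

    import datetime
    import time

    # The announced window: 24 hours from 05:00 UTC, January 18, 2012.
    BLACKOUT_START = datetime.datetime(2012, 1, 18, 5, 0)
    BLACKOUT_END = BLACKOUT_START + datetime.timedelta(hours=24)

    now = datetime.datetime.utcnow()
    if BLACKOUT_START <= now < BLACKOUT_END:
        # Sleep through the blackout instead of failing on every edit.
        time.sleep((BLACKOUT_END - now).total_seconds())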
Please forward and publicize so people who use and run bots are alerted.
--
Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation
Hello all,
as already announced after the last maintenance, the next maintenance will
take place on
Wednesday, 7 December, between 19:00 and 01:00 UTC.
The roots will collect what they plan to do at [1] until Sunday night. If you
have something for us to do (like a software update), please open a bug report
in JIRA and make sure to add the label "maintaince-window" by Sunday noon.
The roots also plan to finish the configuration of Apache by Wednesday, but
there will be no switch yet, to give you all some time for testing (I will
send an email with more details when the time is right).
Sincerely,
DaB.
[1] https://wiki.toolserver.org/view/Admin:Next_maintenance
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hello all,
we should be in 2012 now, so I would like to wish you all a Happy New
Year. Let us hope that the toolserver will run smoothly, stably and fast
this year, that it will be extended as far as is needed, that new and old
users will write wonderful tools, that people continue to use new and old
tools, and that the roots will not cause more trouble than needed.
Happy New Year!
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885