Do you use the same replication script for enwiki and jawiki? A jawiki user is reporting data corruption issues with a jawiki pageID...
> SELECT * FROM jawiki_p.page WHERE page_id=588373;
If this is a data corruption issue that was recently fixed, would it be possible to replay jawiki to fix the corrupted data? If so, do you have an estimate when that might be run?
I just got about 7000 hits for CatScan in the last 60 minutes -
originating from zedler itself. This caused replication lag to rise
nearly linearly over that time. So, whoever is doing that:
spidering tools is generally a bad idea, and if you have access to the
database yourself, it's plain silly. In any case, instead of abusing the web
interface, simply ask me to provide the data in an easy-to-process manner.
It seems the current PHP version on the toolserver has a weird bug,
according to several online sources. Google for:
php "Cannot use string offset as an array"
At least one of my scripts is suffering from this. Can someone please
upgrade to the latest PHP? I hope that will fix it. In the meantime, I'm
trying to code around it, with little success.
Today, we had another replag problem, caused by Aka's user stats script.
Replag kept rising even after the script was stopped (not killed).
Apparently, the reason lies with the transaction mode used: a long
running query may lock (parts of) the tables it uses, denying any
updates the replication daemon wants to make.
As far as I know, this should not happen if the long running query uses
READ UNCOMMITTED (dirty read). Using it should not cause serious
problems in our context; it saves a bit of resources and allows
replication to continue while the tables are being read. Details on
transaction isolation levels can be found here:
DaB. has set the global default isolation level to READ UNCOMMITTED for
now. If you are experiencing problems or need data that is guaranteed to
be consistent under all circumstances, *and* your query does not take
long (no more than, say, a minute or so), you can change the isolation
level like this:
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
REPEATABLE READ is MySQL's "normal" default, i.e. what you have been
using until now if you did not set the isolation level explicitly. There
are other options; see the documentation link above.
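To make the tradeoff concrete, here is a minimal sketch of opting a single session into dirty reads explicitly; the table and query are invented placeholders, not anything from the scripts mentioned above:

```sql
-- Opt this session into dirty reads; long reads then hold no locks that
-- would stall the updates the replication daemon wants to make.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Example long-running read (placeholder query); results may include
-- uncommitted rows, which is acceptable for rough statistics.
SELECT page_namespace, COUNT(*) FROM page GROUP BY page_namespace;
```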
I hope this actually does solve the problem, and does not cause you too
much trouble.
The toolserver is currently bogged down, again, by several massive, long
running queries by kmartin and edwardspec (guys, think about what you are
doing before launching a query - and if you kill one, make sure you
*really* kill it).
As often, there's no root user online to take care of the issue (rob,
please come back...). So, I have been thinking:
How about giving all users (or at least a group of "senior" users) a way
to kill long running queries (if not whitelisted)? It has been policy
for some time now that anything running more than an hour unannounced
can be killed - so why not allow more people to do it?
I think this could be implemented as a stored procedure - or two,
actually: one for listing queries that are eligible for killing (long
running, not whitelisted), and one for actually killing a query by its
ID. Would that be possible permission-wise, i.e. can permissions be
elevated inside a stored procedure? Alternatively, this could be invoked
from the command line using a small executable with the setuid flag.
Any such action should be logged, of course, and ideally an email should
be sent to the affected user automatically. So, what do you think?
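Until such a helper exists, the manual mechanics it would wrap look roughly like this (the ID below is a placeholder, and the one-hour cutoff is the policy mentioned above):

```sql
-- List current queries; the TIME column is the seconds a statement has
-- been running, so anything over 3600 and not whitelisted is fair game.
SHOW FULL PROCESSLIST;

-- Kill one offender by the ID shown in that listing (placeholder ID):
KILL 12345;
```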
Alternatively, we would have to make sure to have a root around 24/7 -
or perhaps install an automated email alert if replag rises above an
hour or so? Can you think of a smart way to resolve the issue?
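As a starting point for such an alert, this is the status output a cron job could poll; the threshold comparison and the mailing would live in the surrounding script, and this assumes standard MySQL replication:

```sql
-- Seconds_Behind_Master in this output is the current replication lag;
-- a cron job could parse it and send mail when it exceeds, say, 3600 s.
SHOW SLAVE STATUS;
```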
PS: I'm aware that long running queries are not the only thing that may
bog down zedler. But it's something that happens frequently, and is
quite easy to detect. So I propose to start there. The next step would
be memory hogs, I guess.
Apache logfiles (/var/log/apache/access) have now been modified to not
include the remote client's IP (it will show 127.0.0.1 instead). Real
logs are available in /var/log/apache/private/access for people in the
With the en part of Duesentrieb's CheckUsage down since forever, I
hacked a quick workaround script using query.php. Example:
Maybe that could be transcluded into the *real* CheckUsage tool. That
aside, it does work for other language= and project= parameters. There's
no interface, however; you'll have to use title=XYZ.
First, I wish to thank all the toolserver staff for the great service
they provide, but I also have a couple of questions:
1. May I add a pubkey to my local permission file in order to allow
another person to login with my account? It would be needed in order to
allow another wikipedian to take care of the bot when I cannot be
online, and I think it's safer than sending my private key around.
2. I have an encoding problem, even setting
it happens that some characters (especially é and ì) aren't rendered
correctly. I'm sure I'll fix this sooner or later, but if any of you
have already had this problem, any help is appreciated.
Bye and thanks again
Lorenzo `paulatz' Paulatto
``Great indeed seems to me the folly of those who would have God make
the universe more proportionate to the small capacity of their reasoning.''
--Galileo Galilei (Opere VII)
I've been wondering recently what the status of Hemlock is. I was
told that as soon as river had his laptop working again, work would
be restarted on setting up Hemlock. What obstacle is causing the
current delays in Hemlock's setup?