On 3/29/06, Tim Starling <tstarling(a)wikimedia.org> wrote:
[quoted out of order]
Who here needs more than 5 requests per second? Who needs a latency of
less than a few hundred milliseconds? What exactly do you want full text
5 requests per second isn't enough to keep up with reading 2-6 full
texts per recent change on enwiki which is what my bad edit detection
tool does (depending on the content, more than two because it will
look deep enough to help people fully revert contiguous multi-edit
Enwiki has a bit over 1.2 main namespace edits per second on a 24 hour
average. I need to keep up with the instantaneous rate on a 1 minute-ish
window, which is much greater, if I don't want the vandalism checking
to lag annoyingly far behind realtime.
This ran with negligible impact on toolserver when we still had text there...
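
The rate argument above can be made concrete with a little arithmetic. The following is only a rough sketch in Python: the edit rate, the 2-6 texts per change, and the 5 requests per second cap come from the text, while the function names are made up for illustration. It uses the 24 hour average only, so short-window bursts would make the picture worse.

```python
# Figures taken from the message; everything else here is illustrative.
EDITS_PER_SECOND = 1.2   # enwiki main namespace, 24 hour average
TEXTS_PER_EDIT = (2, 6)  # full texts read per recent change (low, high)
REQUEST_CAP = 5.0        # requests per second suggested in the quoted text


def required_rate(edits_per_second, texts_per_edit):
    """Full-text requests per second needed to keep up with the edit stream."""
    return edits_per_second * texts_per_edit


def max_sustainable_edit_rate(cap, texts_per_edit):
    """Highest edit rate a given request cap can keep up with."""
    return cap / texts_per_edit


low = required_rate(EDITS_PER_SECOND, TEXTS_PER_EDIT[0])
high = required_rate(EDITS_PER_SECOND, TEXTS_PER_EDIT[1])

print(f"needed on average: {low:.1f} to {high:.1f} requests/s")
print(f"cap of {REQUEST_CAP:.0f} requests/s exceeded at the high end: {high > REQUEST_CAP}")
print(f"edit rate sustainable at 6 texts per edit: "
      f"{max_sustainable_edit_rate(REQUEST_CAP, TEXTS_PER_EDIT[1]):.2f} edits/s")
```

At 6 texts per change the cap covers only about 0.83 edits per second, which is already below the 1.2 daily average before any 1 minute peak is considered.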
It would be easier if we had a VLAN, so that we didn't have to set up 5
ssh tunnels. Does anyone know anything about VLANs? Does anyone care
enough about this project to research it?
That's a subject area well within my professional skillset, I'd be
more than willing to look into it.
Why are the list archives private? Why isn't it listed on the
index page? Why isn't it in gmane? Why have I never
heard of it before? Gregory Maxwell has been whinging that nobody is
listening to him on a list that nobody can read.
Well, it wasn't just a whine; I did repeat my offer to help. I also
CCed the wikidev list, though I can understand why a single request there
would be lost.
Nor were my requests in both the wikimedia-dev IRC channel and the
toolserver IRC channel placed somewhere inaccessible.
It should be possible to set up 5 MySQL instances and have each of them
replicating from a different master. Is anyone volunteering to set up
those instances? Maybe we need to give root access to someone who
actually cares about this stuff.
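
As a rough illustration of the layout proposed above, the Python sketch below generates one `mysqld_multi`-style config section per instance and a `CHANGE MASTER TO` statement for each. The master hostnames, ports, paths, server IDs, and replication user are all hypothetical placeholders, not the real cluster's values, and the binary log coordinates are left for whoever performs the setup.

```python
# Hypothetical master hostnames; the real ones are not given in this thread.
MASTERS = [f"master{n}.example.org" for n in range(1, 6)]


def mysqld_multi_config(masters, base_port=3307, base_server_id=101):
    """One [mysqldN] section per instance, each with its own port,
    socket, data directory, and server ID."""
    sections = []
    for n, _master in enumerate(masters, start=1):
        sections.append("\n".join([
            f"[mysqld{n}]",
            f"port      = {base_port + n - 1}",
            f"socket    = /var/run/mysqld/mysqld{n}.sock",  # placeholder path
            f"datadir   = /var/lib/mysql{n}",               # placeholder path
            # Must differ from the server ID of the master it replicates from.
            f"server-id = {base_server_id + n - 1}",
        ]))
    return "\n\n".join(sections)


def change_master_sql(master_host, user="repl"):
    """Statement to run on the instance that follows master_host.
    Log file and position are omitted on purpose."""
    return f"CHANGE MASTER TO MASTER_HOST='{master_host}', MASTER_USER='{user}';"


config_text = mysqld_multi_config(MASTERS)
statements = [change_master_sql(host) for host in MASTERS]

print(config_text)
print("\n".join(statements))
```

Keeping each instance in its own data directory with its own port means one replication thread failing does not stop the other four.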
I, in effect, volunteered to do it when I previously pointed out on this
list that it would solve that issue. I've set up replication with other
RDBMSes, never MySQL... but there is already a configured instance to
work from. I don't see how this could even be considered a
challenge... it's the ongoing maintenance that carries the real
Neither the e.V. nor Kate made any particular attempt to involve the
other Wikimedia system administrators in this project from its
conception. I was certainly sceptical about zedler's value as a tool
server compared to the use we could have made of it as part of the core
cluster. I've now heard about one project that I'm interested in, and I
have an open mind about the rest, but you still have to make the case.
Specifically: how does your project benefit Wikipedia? Why should I
You should support it for the following reasons:
1) There are many projects on toolserver considered valuable by many
Wikipedia editors. And many of these projects are far less useful, or
not even possible, without realtime or near realtime database access.
(in particular, high speed access to the *links tables is very useful
and difficult to replace with anything available remotely)
2) The existence of the toolserver allows interested third parties to
perform work which would otherwise be requested of the developers.
Work which is of questionable value can be proven by those who care,
real workload on developers can be reduced, and more tasks which are
valuable but not valuable enough to justify interrupting core
developers can be completed.... and finally latency for such requests
3) Toolserver acts as an incubator to build qualified developers. I,
and several other people, now have a far more solid understanding of
the MediaWiki database schema as a result of toolserver... I've
learned many things which wouldn't be important working on a static
dump but are important working on realtime data, and presumably the
real site. Many of the toolserver users are directly using portions
of MediaWiki code, as well. It is intuitively obvious that this will
increase the number of people knowledgeable about MediaWiki, and thus
benefit Wikimedia and MediaWiki.
If you do not believe that toolserver is being effectively utilized,
then by all means begin plans to repurpose it. My response will simply
be to address Jimbo, Angela, and Anthere directly, remind them of
their questions which I used toolserver access to answer, and offer to
donate another comparable piece of hardware under the condition that
it be used exclusively for toolserver use and maintained under a
continued administrative plan I approve of... I have absolutely no
doubt that such a negotiation would reach an agreeable conclusion, as
toolserver is very widely found useful. Quite frankly, you will not be
able to stonewall requests that developers provide the minimal level of
support necessary to sustain this type of service.
It would be far more productive if, rather than demanding we re-justify
the obvious need for toolserver, you offered to work with us to
determine a plan of action which will put the burden of the service
maximally on those interested in it, and minimally on those not
interested in it.
If it isn't clear, I'm more than willing to take a
substantial role in the ongoing maintenance of the existing
toolserver. However, I'm not the ideal candidate: I escaped running
Solaris boxes when I left the world of being a fulltime sysadmin, and
I don't particularly look forward to going back to it... and I am not
a fan of MySQL and find it particularly poorly suited for the sort of
queries we perform on toolserver (and this is demonstrably true: query
planning for anything with a subquery is poor enough to be considered
outright broken). Both of these factors will require me to learn
things I would otherwise not, but I am willing to do so because I am
not burdened by a disbelief in the importance of toolserver. Further,
my persistent and annoying questions about why we continue to work
around MySQL's shortcomings (for example, non-BMP UTF-8 support)
rather than at least evaluating other open source RDBMSes have clearly
caused tension between myself and, at least, our developers who are
otherwise employed by MySQL AB.