On 3/29/06, Tim Starling tstarling@wikimedia.org wrote: [quoted out of order]
Who here needs more than 5 requests per second? Who needs a latency of less than a few hundred milliseconds? What exactly do you want full text replication for?
5 requests per second isn't enough to keep up with reading 2-6 full texts per recent change on enwiki, which is what my bad edit detection tool does (depending on the content, it reads more than two because it will look deep enough to help people fully revert contiguous multi-edit vandalism).
Enwiki has a bit over 1.2 main namespace edits per second on a 24 hour average. I need to keep up with the instantaneous rate on a 1 minute-ish window, which is much greater, if I don't want the vandalism checking to lag annoyingly far behind realtime.
This ran with negligible impact on toolserver when we still had text there...
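To make the arithmetic above concrete, here is a rough sketch using only the figures already given (1.2 edits per second on average, 2-6 full texts per change, a cap of 5 requests per second). The function names are made up for illustration, and the peak edit rate over a 1 minute window is not a number I have stated, so it is left as an input rather than guessed.

```python
# Figures taken from the message above; everything else is illustrative.
AVG_EDITS_PER_SEC = 1.2      # enwiki main namespace, 24 hour average
TEXTS_PER_CHANGE = (2, 6)    # full texts read per recent change (min, max)
REQUEST_CAP_PER_SEC = 5.0    # the proposed limit being questioned


def required_request_rate(edits_per_sec, texts_per_change):
    """Requests per second needed to keep up with a given edit rate."""
    return edits_per_sec * texts_per_change


def sustainable_edit_rate(request_cap, texts_per_change):
    """Highest edit rate the tool can follow without lagging under a cap."""
    return request_cap / texts_per_change


if __name__ == "__main__":
    low, high = TEXTS_PER_CHANGE
    print("needed at average rate: %.1f to %.1f req/s" % (
        required_request_rate(AVG_EDITS_PER_SEC, low),
        required_request_rate(AVG_EDITS_PER_SEC, high)))
    print("edit rate sustainable at cap, deep reads: %.2f edits/s" % (
        sustainable_edit_rate(REQUEST_CAP_PER_SEC, high)))
```

Even at the 24 hour average, the deep-read case needs more than the cap allows, and the cap only sustains an edit rate below that average; any burst above the average makes the gap wider.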
[snip]
It would be easier if we had a VLAN, so that we didn't have to set up 5 ssh tunnels. Does anyone know anything about VLANs? Does anyone care enough about this project to research it?
That's a subject area well within my professional skillset; I'd be more than willing to look into it.
[snip]
Why are the list archives private? Why isn't it listed on the mail.wikipedia.org index page? Why isn't it in gmane? Why have I never heard of it before? Gregory Maxwell has been whinging that nobody is listening to him on a list that nobody can read.
Well, it wasn't just a whine; I did repeat my offer to help. I also CCed the wikidev list, though I can understand why a single request there would be lost.
Nor were my requests, which I placed in both the wikimedia-dev IRC channel and the toolserver IRC channel, made in an inaccessible place.
[snip]
It should be possible to set up 5 MySQL instances and have each of them replicating from a different master. Is anyone volunteering to set up those instances? Maybe we need to give root access to someone who actually cares about this stuff.
I, in effect, volunteered to do it when I previously pointed out on this list that it would solve that issue. I've set up replication with other RDBMSes, never MySQL... but there is already a configured instance to work from. I don't even see how this could be considered a challenge... it's the ongoing maintenance that carries the real burden.
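For what it's worth, a minimal sketch of what five side-by-side instances could look like, written as a small script that prints one config group per instance in the `[mysqldN]` form that `mysqld_multi` reads. The ports, paths, and server-id values are placeholders I made up, not the toolserver's actual layout, and I have not tested this against the existing instance.

```python
# Hypothetical layout: ports, directories, and server ids are placeholders.
BASE_PORT = 3307
BASE_DIR = "/var/lib/mysql-replica"   # assumed path, not the real one
BASE_SERVER_ID = 1000                 # must not collide with any master


def render_instance(index):
    """Return the config group for one replica instance (1-based index)."""
    datadir = "%s%d" % (BASE_DIR, index)
    lines = [
        "[mysqld%d]" % index,
        "port = %d" % (BASE_PORT + index - 1),
        "socket = /tmp/mysql-replica%d.sock" % index,
        "datadir = %s" % datadir,
        "pid-file = %s/mysqld.pid" % datadir,
        "server-id = %d" % (BASE_SERVER_ID + index),
    ]
    return "\n".join(lines)


def render_config(count=5):
    """Return config groups for `count` instances, one per master."""
    groups = [render_instance(i) for i in range(1, count + 1)]
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_config())
```

Each instance gets its own port, socket, data directory, and server id so they cannot step on each other; pointing each one at its own master would then be done per instance (for example with CHANGE MASTER TO), which this sketch deliberately leaves out.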
Neither the e.V. nor Kate made any particular attempt to involve the other Wikimedia system administrators in this project from its conception. I was certainly sceptical about zedler's value as a tool server compared to the use we could have made of it as part of the core cluster. I've now heard about one project that I'm interested in, and I have an open mind about the rest, but you still have to make the case. Specifically: how does your project benefit Wikipedia? Why should I support it?
You should support it for the following reasons:
1) There are many projects on toolserver considered valuable by many Wikipedia editors. Many of these projects are far less useful, or not even possible, without realtime or near-realtime database access. (In particular, high-speed access to the *links tables is very useful and difficult to replace with anything available remotely.)
2) The existence of the toolserver allows interested third parties to perform work which would otherwise be requested of the developers. Work which is of questionable value can be proven by those who care, the real workload on developers can be reduced, and more tasks which are valuable but not valuable enough to justify interrupting core developers can be completed... and finally, latency for such requests is reduced.
3) Toolserver acts as an incubator to build qualified developers. I, and several other people, now have a far more solid understanding of the MediaWiki database schema as a result of toolserver... I've learned many things which wouldn't be important working on a static dump but are important working on realtime data, and presumably on the real site. Many of the toolserver users are directly using portions of MediaWiki code, as well. It is intuitively obvious that this will increase the number of people knowledgeable about MediaWiki, and thus benefit Wikimedia and MediaWiki.
If you do not believe that toolserver is being effectively utilized, then by all means begin plans to repurpose it. My response will simply be to address Jimbo, Angela, and Anthere directly, remind them of their questions which I used toolserver access to answer, and offer to donate another comparable piece of hardware under the condition that it be used exclusively for toolserver use and maintained under a continued administrative plan I approve of... I have absolutely no doubt that such a negotiation would reach an agreeable conclusion, as toolserver is very widely found useful. Quite frankly, you will not be able to stonewall requests that developers provide the minimal level of support necessary to sustain this type of service.
It would be far more productive if, rather than demanding we re-justify the obvious need for toolserver, you offered to work with us to determine a plan of action which will put the burden of the service maximally on those interested in it, and minimally on those not interested in it.
If it isn't clear, I'm more than willing to take a substantial role in the ongoing maintenance of the existing toolserver. However, I'm not the ideal candidate: I escaped running Solaris boxes when I left the world of being a fulltime sysadmin, and I don't particularly look forward to going back to it... and I am not a fan of MySQL and find it particularly poorly suited for the sort of queries we perform on toolserver (and this is demonstrably true: query planning for anything with a subquery is poor enough to be considered outright broken). Both of these factors will require me to learn things I would otherwise not, but I am willing to do so because I am not burdened by a disbelief in the importance of toolserver. Further, my persistent and annoying questions about why we continue to work around MySQL's shortcomings (for example, the lack of non-BMP UTF-8 support) rather than at least evaluating other open source RDBMSes have clearly caused tension between myself and, at least, our developers who are otherwise employed by MySQL AB.