Hi.
I've been a big proponent of the idea that we should work toward the creation of multiple Toolservers. Redundancy in this area would be a good thing. Setting up infrastructure that allows any person with an AWS account or extra server capacity to set up and run their own full Toolserver (that is, set up real-time database replication + shared hosting) would be great.
That said, I think it would be foolish to ignore the fact that the German Toolserver is slowly dying and that Wikimedia Labs is its natural successor. There was an IRC office hours earlier today and my takeaway from it was that Labs is on the right path, but that it isn't marketing itself well.
Lately, the Toolserver has been horribly unstable. Web tools stop working, queries are killed, databases are corrupt, replication lag hurts, and everything feels overloaded and tired. Labs should better advertise itself as a robust, stable hosting platform. Stability is important to developers with limited time, who don’t want to spend their time debugging hosting issues. The Toolserver had a tendency to make breaking changes (changing the operating system, changing the type of Web server, killing and then resuscitating support for Vixie cron, requiring the use of SGE, etc.). Labs should implement and emphasize a better approach to these hosting issues, to garner adopters and supporters.
Database replication came up yet again in the office hours. Many developers (myself included) seem to be holding off on Labs until database replication is up and running. The sooner this can happen, the better. But the remaining sticking point seems to be cross-database joins, which people in the office hours suggested using federated tables or application logic to replace. It would help if the Labs folks could better explain _why_ cross-database joins won't be supported (I think most developers would agree with the reasoning) and offer better guidance and documentation for how to work around this hurdle. (For example, what is a federated table?)
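As a rough illustration of the "application logic" option mentioned above, here is a minimal sketch of a join done in the tool's own code instead of in SQL. Everything in it is an assumption made for the example: Python's sqlite3 stands in for the real database servers, the two in-memory connections stand in for two databases that cannot be joined directly, and the table and column names (page, page_status, status) and the helper functions are hypothetical. It is not a description of how Labs will actually do it.

```python
import sqlite3


def make_db(schema, insert, rows):
    """Create a throwaway in-memory database (stand-in for a real server)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    conn.executemany(insert, rows)
    return conn


# Two separate connections: a query on one cannot reference tables in the other.
wiki_db = make_db(
    "CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)",
    "INSERT INTO page VALUES (?, ?)",
    [(1, "Main_Page"), (2, "Sandbox"), (3, "Help")],
)
tool_db = make_db(
    "CREATE TABLE page_status (page_id INTEGER PRIMARY KEY, status TEXT)",
    "INSERT INTO page_status VALUES (?, ?)",
    [(1, "reviewed"), (3, "flagged")],
)


def join_in_application(wiki_db, tool_db, batch_size=500):
    """Emulate an inner join on page_id across two unrelated connections."""
    # Step 1: read the smaller side completely and index it by the join key.
    statuses = dict(tool_db.execute("SELECT page_id, status FROM page_status"))
    ids = sorted(statuses)
    results = []
    # Step 2: look up matching rows on the other side in batches, so the
    # IN (...) list stays below the server's parameter limit.
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        query = (
            "SELECT page_id, page_title FROM page "
            f"WHERE page_id IN ({placeholders}) ORDER BY page_id"
        )
        for page_id, title in wiki_db.execute(query, batch):
            # Step 3: combine both sides in memory.
            results.append((page_id, title, statuses[page_id]))
    return results


rows = join_in_application(wiki_db, tool_db)
```

The obvious cost is that the smaller side has to fit in memory and the join takes at least two round trips, which is presumably why people keep asking for real cross-database joins.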
Yes, there will always be some contingent that's upset at the seemingly underhanded way in which Labs was brought into the world (simply announced one day, with the implicit acceptance that the German Toolserver would be put down in time or left to rot), but I think there are far more people who don't support or are ambivalent toward Labs when these people could and should be proponents for it. Labs just needs a bit better public relations, in my opinion.
MZMcBride
On Wed, May 1, 2013 at 3:02 AM, MZMcBride z@mzmcbride.com wrote:
(For example, what is a federated table?)
http://dev.mysql.com/doc/en/federated-storage-engine.html
I don't know what peter/asher had planned for the new Maria world. http://serge.frezefond.com/2013/04/mariadb-connect-storage-engine-vs-federat...
-Jeremy
On 04/30/2013 11:10 PM, Jeremy Baron wrote:
I don't know what peter/asher had planned for the new Maria world.
Yes, the fact that there are a couple of near-equivalent ways of doing it is part of the reason why we haven't been all that explicit yet -- I want to do some testing first to find the best solution before we encourage people to use it; there's little point in having everyone start doing it one way just to come back a few weeks later to announce "well, we're doing this another way after all". :-)
-- Marc
On 04/30/2013 11:02 PM, MZMcBride wrote:
It would help if the Labs folks could better explain _why_ cross-database joins won't be supported (I think most developers would agree with the reasoning) and offer better guidance and documentation for how to work around this hurdle. (For example, what is a federated table?)
That's a very good point, MZM. That said, no small part of the reason we haven't given a great deal of explanation on how the alternatives will work is that we're not entirely certain what form those alternatives will take yet - at least in the detail.
I think you're right that the developer community needs at least /some/ explanation sooner rather than later; even if it's a big picture that's lacking in detail. I'll try to write up something this week on that topic.
-- Marc
On Tue, Apr 30, 2013 at 8:02 PM, MZMcBride z@mzmcbride.com wrote:
Yes, there will always be some contingent that's upset at the seemingly
underhanded way in which Labs was brought into the world (simply announced one day, with the implicit acceptance that the German Toolserver would be put down in time or left to rot), but I think there are far more people who don't support or are ambivalent toward Labs when these people could and should be proponents for it. Labs just needs a bit better public relations, in my opinion.
Just because the events coincided didn't mean they were connected. You've simply assumed this (in bad faith).
WMF wanted a virtual environment for testing and development during the usability initiative. Tesla was the first iteration of that. I was hired as full staff shortly afterward as an operations engineer to build a virtual cluster. I very heavily disliked creating virtual machines for people, then configuring them, setting up user accounts, ssh keys, rights, etc. I wanted people to do that themselves. That's when I started looking at some clustering services that allowed self-service provisioning. The multi-tenancy of OpenStack is what started forming my ideas for Labs.
When I started working on it, I decided that this would be something really useful to have open to the entire community. The multi-tenancy meant we could have communities run different projects; it felt like how we run the content portion of our community. I especially wanted this because I didn't like that we were unable to give shell/root to community members and wanted a way for us to eventually allow that again.
Toolserver wasn't even a consideration of mine. Using Labs to replace TS wasn't really an idea till a short time before the beta launch at the New Orleans hackathon. The work that's been concentrated on so far was my original roadmap for Labs and we're just now starting to hit the TS feature sets.
Since the really poor initial reaction to the deprecation of TS, I've avoided doing any marketing whatsoever, because it tended to cause flamewars. Instead I worked on stabilizing Labs and starting on the necessary features to enable tool labs. Thankfully Coren has been able to work on this fully, which has greatly accelerated the schedule.
We now have Silke and Sumana working with everyone, so I feel the marketing aspect will improve. We have "move your tool/bot" workshops running the entirety of the Amsterdam Hackathon and the Wikimania Hackathon, the latter being a mass migration hackathon (hopefully).
- Ryan
Hello, At Wednesday 01 May 2013 14:46:08 DaB. wrote:
The Toolserver had a tendency to make breaking changes
Sorry, I have to disagree here. We are very backwards-compatible. We still support a Solaris login server just because people are too lazy to convert their stuff to Linux, and we still support 2 variants of cron because the users cannot decide which one to use. We have let people run non-SGE tasks 2 YEARS after River announced the usage of SGE. If our little changes are too much for you, then moving to Labs will be out of the question for you.
Sincerely, DaB.
(anonymous) wrote:
[...]
Database replication came up yet again in the office hours. Many developers (myself included) seem to be holding off on Labs until database replication is up and running. The sooner this can happen, the better. But the remaining sticking point seems to be cross-database joins, which people in the office hours suggested using federated tables or application logic to replace. It would help if the Labs folks could better explain _why_ cross-database joins won't be supported (I think most developers would agree with the reasoning) and offer better guidance and documentation for how to work around this hurdle. (For example, what is a federated table?)
[...]
The problem with this decision is the effort spent and the insincerity. If database replication for Labs would have meant moving some dbxxx servers to labsdbxxx, adding the views existing on Toolserver and tightening some firewall rules, it could have been set up in a month, and any moaning about having to use federated tables would have been silenced by the minutes it would take to add another server to the cluster to increase performance compared to the years it takes at Toolserver.
Or there could have been some new concept like Galera mentioned by Nosy that eases maintenance because it is not some sparsely documented Solaris thingy in River style.
But now the plan is to have two clusters (PreLabsDBDBS and LabsDB), use triggers to remove data and then (additionally!) views, end up with less functionality than the Toolserver while gaining nothing, and all that takes half a year to set up in a cloak-and-dagger way while publicly the need for cross-database JOINs is acknowledged.
Tim
toolserver-l@lists.wikimedia.org