(anonymous) wrote:
We currently have no plans for having the user databases on the same servers as the replicated databases. Direct joins will not be possible, so tools will need to be modified.
This is unfortunate, and a huge step backwards from the situation on the toolserver.
For example, the project I maintain on toolserver (the enwiki WP 1.0 assessment data) has user database tables with several million rows of data about articles. From these it needs to select the rows for pages in fixed categories on the wiki, each of which could have a few thousand members. The natural way to do this is to join against the categorylinks table. Any non-join solution is going to be much, much less efficient.
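As a rough sketch of the difference (not the actual WP 1.0 code): the snippet below uses an in-memory SQLite database as a stand-in for MySQL, a simplified categorylinks table with only cl_from (page id) and cl_to (category name), and a made-up user table called ratings with hypothetical columns r_page_id and r_quality. The first function is the single join that works when both tables live on the same server. The second is roughly what a tool has to do when they do not: fetch the page ids from the replica, then feed them back to the user database in batches.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with both tables (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);  -- simplified
        CREATE TABLE ratings (r_page_id INTEGER, r_quality TEXT);  -- hypothetical
    """)
    conn.executemany("INSERT INTO categorylinks VALUES (?, ?)",
                     [(1, "Physics"), (2, "Physics"), (3, "Chemistry")])
    conn.executemany("INSERT INTO ratings VALUES (?, ?)",
                     [(1, "FA"), (2, "B"), (3, "GA"), (4, "Stub")])
    return conn


def select_with_join(conn, category):
    """Both tables on one server: a single query, planned by the database."""
    return conn.execute(
        "SELECT r.r_page_id, r.r_quality "
        "FROM ratings r JOIN categorylinks cl ON cl.cl_from = r.r_page_id "
        "WHERE cl.cl_to = ? ORDER BY r.r_page_id",
        (category,),
    ).fetchall()


def select_without_join(replica, userdb, category, batch_size=500):
    """Tables on separate servers: pull ids from one, push them to the other."""
    # Round trip 1: every member of the category crosses the network.
    ids = [row[0] for row in replica.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_to = ?", (category,))]
    results = []
    # Round trips 2..n: send the ids back in batches as IN (...) lists.
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        results.extend(userdb.execute(
            "SELECT r_page_id, r_quality FROM ratings "
            f"WHERE r_page_id IN ({placeholders})",
            batch,
        ).fetchall())
    return sorted(results)
```

In the demo the same connection can be passed as both replica and userdb; on separate servers these would be two connections. With a few thousand category members per category, the second version ships every id across the network twice and issues several queries where the join needs one.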
A key benefit of the toolserver setup was that it allowed these sorts of joins. Web hosting is cheap, and data about the live wiki is already available in non-joinable form through the API with no replag.
Even more: If Labs replication isn't bound by Toolserver tradition, it would be *very* nice not to fragment the data according to the different WMF clusters, plus Commons or not, plus (separate) user databases or not, but have one cluster where users can join as logic suggests. As Toolserver merges Commons onto other clusters already, this seems to be possible with MySQL.
Tim