> ... As far as I could see in the docs, connection reusing and cross-DB joins are not documented or advertised ...   

Not sure what you are talking about. Cross-DB joins were a key feature of Toolserver[1], which Wikimedia Labs replaced in 2012-2014. They were not just an initial feature of the Wikimedia Labs DB replicas; they were also a mandatory feature for the transition, because tools at the time depended on them.[2] There also used to be some documentation on the wiki, written by WMF tech and based on the initial configuration ticket.[3]

[1] https://meta.wikimedia.org/wiki/Toolserver
[2] https://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Migration_of_Toolserver_tools#Will_I_be_able_to_join_user_databases_with_wiki_ones?
[3] https://static-bugzilla.wikimedia.org/show_bug.cgi?id=57876

Br,
-- Kimmo Virtanen, Zache

On Mon, Nov 16, 2020 at 10:43 PM Joaquin Oltra Hernandez <jhernandez@wikimedia.org> wrote:
Hi Maarten,

I believe this work started many years ago, was paused, and was recently restarted because of the stability and performance problems of the last few years. Breaking changes are always painful; in the case of the replicas, I think the changes follow the recommendations laid out years ago. As far as I could see in the docs, connection reusing and cross-DB joins are not documented or advertised. The fact that they work is an implementation detail that has been useful, but with the amount of data we have, it makes the service unstable, slow, and very hard to maintain. For example, people often report issues when looking at replag, and here are some examples of recent instability and crashes due to the current architecture and usage.

I'm sorry about the extra work this will cause. I hope the improved stability and performance will make it worth it for you, and that you will reconsider and migrate your code to work on the new architecture (or reach out for specific help if you need it). Your experience and examples would be very helpful for other developers in the community.

On Wed, Nov 11, 2020 at 10:11 PM Maarten Dammers <maarten@mdammers.nl> wrote:

Hi Joaquin,

On 10-11-2020 21:26, Joaquin Oltra Hernandez wrote:
TLDR: Wiki Replicas' architecture is being redesigned for stability and performance. Cross database JOINs will not be available and a host connection will only allow querying its associated DB. See [1] for more details.
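
For tools that currently rely on a cross-database JOIN, one possible workaround is to query each database over its own connection and combine the rows in application code. The following is only a minimal sketch of that idea, not an official migration recipe: two in-memory sqlite3 databases stand in for two separate replica hosts, and the table names (files, items), their columns, and the helper join_in_application are hypothetical, chosen for illustration rather than taken from the real replica schemas.

```python
import sqlite3


def make_db(schema, insert_sql, rows):
    """Create an in-memory database; each one stands in for a separate host."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    conn.executemany(insert_sql, rows)
    return conn


# Hypothetical stand-in for a Commons-like database.
commons = make_db(
    "CREATE TABLE files (title TEXT PRIMARY KEY, uploader TEXT)",
    "INSERT INTO files VALUES (?, ?)",
    [("A.jpg", "alice"), ("B.jpg", "bob"), ("C.jpg", "carol")],
)

# Hypothetical stand-in for a Wikidata-like database.
wikidata = make_db(
    "CREATE TABLE items (item_id TEXT PRIMARY KEY, image_title TEXT)",
    "INSERT INTO items VALUES (?, ?)",
    [("Q1", "A.jpg"), ("Q2", "C.jpg"), ("Q3", "Z.jpg")],
)


def join_in_application(commons_conn, wikidata_conn, batch_size=500):
    """Emulate an inner join across two connections that cannot see each other."""
    # Step 1: read the join keys from the first database.
    items = wikidata_conn.execute(
        "SELECT item_id, image_title FROM items ORDER BY item_id"
    ).fetchall()
    titles = sorted({title for _, title in items})

    # Step 2: look the keys up in the second database, in batches, so that
    # no single query carries an unbounded number of parameters.
    uploader_by_title = {}
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        rows = commons_conn.execute(
            f"SELECT title, uploader FROM files WHERE title IN ({placeholders})",
            batch,
        )
        uploader_by_title.update(rows)

    # Step 3: combine in memory, keeping only rows present on both sides.
    return [
        (item_id, title, uploader_by_title[title])
        for item_id, title in items
        if title in uploader_by_title
    ]


result = join_in_application(commons, wikidata)
```

The trade-off is that the join now costs extra round trips and memory in the tool itself, so this approach suits cases where the key set from the first query is reasonably small.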

If you only think of Wikipedia, probably not a lot will break, but if you take Commons and Wikidata into account, a lot will break. A quick grep in my folder of Commons queries returns 123 lines with cross-database joins. So yes, stuff will break and tools will be abandoned. This follows the practice that seems to have become standard for the WMF these days: decisions are made by a small group within the WMF without any community involvement, and are announced only after they have been made.

Unhappy and disappointed,

Maarten

_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly labs-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


--
Joaquin Oltra Hernandez
Developer Advocate - Wikimedia Foundation