... As far as I could see in the docs, connection reusing and cross-DB joins
are not documented or advertised ...
Not sure what you are talking about. Cross-DB joins were a key feature
of ToolServer[1], which Wikimedia Labs replaced in 2012-2014. They were
not just initial features of the Wikimedia Labs DB replicas; they were also
mandatory features for the transition, because tools at the time
depended on them.[2] There also used to be some level of documentation
written by WMF tech on the wiki, based on the initial configuration
ticket.[3]
[1]
Br,
-- Kimmo Virtanen, Zache
On Mon, Nov 16, 2020 at 10:43 PM Joaquin Oltra Hernandez <
jhernandez(a)wikimedia.org> wrote:
Hi Maarten,
I believe this work started many years ago, was paused, and was
recently restarted because of the stability and performance problems of
recent years. Breaking changes are always painful; in the case of the
replicas, I think the changes follow the recommendations laid out years ago.
As far as I could see in the docs, connection reusing and cross-DB joins
are not documented or advertised. The fact that they work is an
implementation detail that has been useful, but with the amount of data we
have, it makes the service unstable, slow, and very hard to maintain. For
example, people often report issues when looking at replag
<https://replag.toolforge.org/>, and here
<https://phabricator.wikimedia.org/search/query/vzOgtuG0eo.n/#R> are some
examples of recent instability and crashes due to the current architecture
and usage.
I'm sorry about the extra work this will cause. I hope the improved
stability and performance will make it worth it for you, and that you will
reconsider and migrate your code to work on the new architecture (or reach
out for specific help if you need it). Your experience and examples would
be very helpful for other developers in the community.
On Wed, Nov 11, 2020 at 10:11 PM Maarten Dammers <maarten(a)mdammers.nl>
wrote:
Hi Joaquin,
On 10-11-2020 21:26, Joaquin Oltra Hernandez wrote:
TLDR: Wiki Replicas' architecture is being redesigned for stability and
performance. Cross database JOINs will not be available and a host
connection will only allow querying its associated DB. See [1]
<https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign>
for more details.
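As an illustration of what the restriction above means for tool code, here is a
minimal sketch of replacing a cross-database JOIN with two single-database
queries combined in the application. Everything in it is an assumption made
for the example: the table names (files, items), the column names, and the
sample rows are hypothetical and do not reflect the real replica schema, and
sqlite3 in-memory databases stand in for two separate replica connections
only so that the sketch runs on its own.

```python
import sqlite3


def make_commons():
    # Stand-in for a connection to the Commons replica (hypothetical schema,
    # made-up sample rows).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (title TEXT, item_id TEXT)")
    conn.executemany(
        "INSERT INTO files VALUES (?, ?)",
        [
            ("Example_A.jpg", "Q1001"),
            ("Example_B.jpg", "Q1002"),
            ("Unlinked.jpg", None),
        ],
    )
    return conn


def make_wikidata():
    # Stand-in for a separate connection to the Wikidata replica
    # (hypothetical schema, made-up sample rows).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (item_id TEXT, label TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [("Q1001", "Example item A"), ("Q1002", "Example item B")],
    )
    return conn


def files_with_labels(commons, wikidata, batch_size=500):
    """Emulate an inner join across two databases in application code."""
    # Step 1: query the first database on its own connection.
    rows = commons.execute(
        "SELECT title, item_id FROM files "
        "WHERE item_id IS NOT NULL ORDER BY title"
    ).fetchall()

    # Step 2: look up the matching rows in the second database, in batches,
    # so the IN (...) list stays a manageable size.
    ids = sorted({item_id for _, item_id in rows})
    labels = {}
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        query = (
            "SELECT item_id, label FROM items "
            f"WHERE item_id IN ({placeholders})"
        )
        labels.update(wikidata.execute(query, batch).fetchall())

    # Step 3: combine the two result sets in memory (inner-join semantics).
    return [(title, labels[item_id]) for title, item_id in rows
            if item_id in labels]


if __name__ == "__main__":
    for title, label in files_with_labels(make_commons(), make_wikidata()):
        print(title, label)
```

The trade-off is that the join now happens in the tool's memory rather than on
the database server, so large result sets may need to be processed in chunks.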
If you only think of Wikipedia, probably not a lot will break, but if you
take Commons and Wikidata into account, a lot will break. A quick grep in my
folder with Commons queries returns 123 lines with cross-database joins. So
yes, stuff will break and tools will be abandoned. This follows the
practice that seems to have become standard for the WMF these days:
decisions are made by a small group within the WMF without any community
involvement. Only after the decision has been made is it announced.
Unhappy and disappointed,
Maarten
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud(a)lists.wikimedia.org (formerly labs-l(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud
--
Joaquin Oltra Hernandez
Developer Advocate - Wikimedia Foundation