Cross-wiki JOINs are used by some of the queries we run regularly for
fawiki. One of those queries looks for articles that don't have an image in
their infobox on fawiki but do have one on enwiki, so that we can
use/import that image. Another one JOINs fawiki data with commons data to
look for redundant images. Yet another one looks for articles that use
an image that doesn't exist (for cleanup purposes), but it needs to JOIN with
the commons DB because the referenced file might exist there. Lastly, we have a
report that looks for fair use images on fawiki that have the same name as
an image on enwiki where the enwiki copy was deleted; this usually
indicates an improper application of fair use, and enwiki -- due to its
larger community -- finds and deletes these faster than we could on fawiki.
There may be other cases I am unaware of. The point is, losing the
cross-wiki JOIN capability can make some of the above tasks really
difficult or completely impossible.

On Tue, Nov 10, 2020 at 3:27 PM Joaquin Oltra Hernandez <
jhernandez(a)wikimedia.org> wrote:
TLDR: Wiki Replicas' architecture is being redesigned for stability and
performance. Cross database JOINs will not be available and a host
connection will only allow querying its associated DB. See [1]
<https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign>
for more details.

Hi!
In the interest of making and keeping Wiki Replicas a stable and
performant service, a new backend architecture is needed. This has some
impact on features and usage patterns.

What should I do? To avoid breaking changes, you can start making the
following changes *now*:
- Update existing tools to ensure queries are executed against the proper
  database connection
  - E.g.: If you want to query the `eswiki_p` DB, you must connect to the
    `eswiki.analytics.db.svc.eqiad.wmflabs` host and `eswiki_p` DB, and not
    to enwiki or other hosts
- Check your existing tools' and services' queries for cross database JOINs,
  and rewrite the joins in application code
  - E.g.: If you are doing a join across databases, for example joining
    `enwiki_p` and `eswiki_p`, you will need to query them separately, and
    filter the results of the separate queries in the code
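
As a rough illustration of the two changes listed above, here is a minimal
Python sketch. It is not an official recipe: `replica_host` and
`join_in_code` are hypothetical helper names, the host suffix is taken from
the `eswiki_p` example and is only assumed to apply to other wikis, and the
row dictionaries and the `name` key are illustrative stand-ins for the
results of two separate queries, each run over its own connection.

```python
from collections import defaultdict

# Suffix copied from the eswiki example in this announcement; that every
# wiki follows the same pattern is an assumption.
REPLICA_HOST_SUFFIX = ".analytics.db.svc.eqiad.wmflabs"


def replica_host(db_name):
    """Map a replica DB name such as 'eswiki_p' to the host to connect to."""
    wiki = db_name[:-2] if db_name.endswith("_p") else db_name
    return wiki + REPLICA_HOST_SUFFIX


def join_in_code(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on `key`, in application code.

    Each list is expected to come from a separate query against a separate
    database connection. On column-name clashes, the left row's value wins.
    """
    # Index the right-hand rows by the join key for constant-time lookups.
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)

    joined = []
    for left in left_rows:
        for right in index.get(left[key], []):
            joined.append({**right, **left})
    return joined


# Illustrative data standing in for the results of two separate queries.
eswiki_rows = [{"name": "A.jpg", "es": 1}, {"name": "B.jpg", "es": 2}]
enwiki_rows = [{"name": "B.jpg", "en": 3}]
matches = join_in_code(eswiki_rows, enwiki_rows, "name")
```

Indexing one side in a dictionary keeps the join roughly linear in the number
of rows, but both result sets have to fit in memory, so for large tables the
individual queries may need to be narrowed or batched first.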
Timeline:
- November - December: Early adopter testing
- January 2021: Existing and new systems online, transition period starts
- February 2021: Old hardware is decommissioned
We need your help
- If you would like to beta test the new architecture, please let us know
  and we will reach out to you soon
- You can also help by sharing examples / descriptions of how a tool or
  service was updated, writing a common solution or some example code others
  can utilize and reference, and helping others on IRC and the mailing lists
If you have questions or need help adapting your code or queries, please
contact us [2]
<https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_communication>,
or write on the talk page [3]
<https://wikitech.wikimedia.org/wiki/Talk:News/Wiki_Replicas_2020_Redesign>.
We will be sending reminders, and more specific examples of the changes
via email and on the wiki page. For more information see [1]
<https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign>.
[1]: https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign
[2]: https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_communication
[3]: https://wikitech.wikimedia.org/wiki/Talk:News/Wiki_Replicas_2020_Redesign
--
Joaquin Oltra Hernandez
Developer Advocate - Wikimedia Foundation
_______________________________________________
Wikimedia Cloud Services announce mailing list
Cloud-announce(a)lists.wikimedia.org (formerly
labs-announce(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce