Hi! Most tools query just a single db at a time, so I don't think this will be a massive problem. However some such as Global Contribs[0] and GUC[1] can theoretically query all of them from a single request. Creating new connections on-the-fly seems doable in production, the issue is how to work on these tools in a local environment. Currently the recommendation is to use a SSH tunnel to the desired host,[2] such as enwiki.analytics.db.svc.eqiad.wmflabs. Surely we can't do this same port forwarding for 900+ connections.

Any ideas? Perhaps there's some way to make a host that automatically forwards to the correct one, solely for developer use? Or will development of such global tools need to happen in the Cloud Services environment?

~ MA

[0] https://xtools.wmflabs.org/globalcontribs
[1] https://guc.toolforge.org/
[2] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#SSH_tunneling_for_local_testing_which_makes_use_of_Wiki_Replica_databases

On Tue, Nov 10, 2020 at 3:26 PM Joaquin Oltra Hernandez <jhernandez@wikimedia.org> wrote:
TLDR: Wiki Replicas' architecture is being redesigned for stability and performance. Cross database JOINs will not be available and a host connection will only allow querying its associated DB. See [1] for more details.

Hi!

In the interest of making and keeping Wiki Replicas a stable and performant service, a new backend architecture is needed. There is some impact in the features and usage patterns.

What should I do? To avoid breaking changes, you can start making the following changes *now*:
- Update existing tools to ensure queries are executed against the proper database connection
  - Eg: If you want to query the `eswiki_p` DB, you must connect to the `eswiki.analytics.db.svc.eqiad.wmflabs` host and `eswiki_p` DB, and not to enwiki or other hosts
- Check your existing tools and services queries for cross database JOINs, rewrite the joins in application code
  - Eg: If you are doing a join across databases, for example joining `enwiki_p` and `eswiki_p`, you will need to query them separately, and filter the results of the separate queries in the code

Timeline:
- November - December: Early adopter testing
- January 2021: Existing and new systems online, transition period starts
- February 2021: Old hardware is decommissioned

We need your help
- If you would like to beta test the new architecture, please let us know and we will reach out to you soon
- Sharing examples / descriptions of how a tool or service was updated, writing a common solution or some example code others can utilize and reference, helping others on IRC and the mailing lists

If you have questions or need help adapting your code or queries, please contact us [2], or write on the talk page [3].

We will be sending reminders, and more specific examples of the changes via email and on the wiki page. For more information see [1].

[1]: https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign
[2]: https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_communication
[3]: https://wikitech.wikimedia.org/wiki/Talk:News/Wiki_Replicas_2020_Redesign

--
Joaquin Oltra Hernandez
Developer Advocate - Wikimedia Foundation
_______________________________________________
Wikimedia Cloud Services announce mailing list
Cloud-announce@lists.wikimedia.org (formerly labs-announce@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce