Hi there,
We are about to upgrade the Kubernetes version that runs PAWS, from 1.6 to 1.17.
We don't expect any major interruptions to the service, perhaps only some
hiccups when pods are restarted/rescheduled.
More information is available in this Phabricator ticket:
https://phabricator.wikimedia.org/T268669
The operation may take between 30 minutes and 1 hour, and we will
start soon after I finish sending this email.
Please ping us if you see anything wrong.
Regards.
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation
Every year or so the Cloud Services team tries to identify and clean up
unused projects and VMs. We do this via an opt-in process: anyone can
mark a project as 'in use,' and that project will be preserved for
another year.
I've created a wiki page that lists all existing projects, here:
https://wikitech.wikimedia.org/wiki/News/Cloud_VPS_2020_Purge
If you are a VPS user, please visit that page and mark any projects that
you use as {{Used}}. Note that it's not necessary for you to be a
project admin to mark something -- if you know that you're currently
using a resource and want to keep using it, go ahead and mark it
accordingly. If you /are/ a project admin, please take a moment to mark
which VMs are or aren't used in your projects.
When December arrives, I will shut down unused projects and begin the
process of reclaiming their resources.
If you think you use a VPS project but aren't sure which, I encourage
you to poke around on https://tools.wmflabs.org/openstack-browser/ to
see what looks familiar. Worst case, just email
cloud(a)lists.wikimedia.org with a description of your use case and we'll
sort it out there.
Users who only use Toolforge are free to ignore this task.
Thank you!
-Andrew and WMCS team
The ToolsDB service suffered a breakage in replication on 2020-10-27. WMCS has tried to restore replication of data, including by doing a dump to rebuild replication without downtime, but that has been unsuccessful so far.
At this point, we have a new server waiting to become the replica, but to start the replication process, we need to set the database to read-only for a full dump. This could easily take more than an hour. During that entire time, the database will be read-only.
We will begin at 1600 UTC and finish when it is done. The database is quite large, but, with it in read-only mode, I hope the backup will not take terribly long.
Please see https://phabricator.wikimedia.org/T266587 for additional information.
Brooke Storm
Staff SRE
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm
TLDR: Wiki Replicas' architecture is being redesigned for stability and
performance. Cross-database JOINs will not be available, and a host
connection will only allow querying its associated DB. See [1] for
more details.
Hi!
In the interest of making and keeping Wiki Replicas a stable and performant
service, a new backend architecture is needed. This has some impact on
features and usage patterns.
What should I do? To avoid breaking changes, you can start making the
following changes *now*:
- Update existing tools to ensure queries are executed against the proper
  database connection
  - E.g.: if you want to query the `eswiki_p` DB, you must connect to the
    `eswiki.analytics.db.svc.eqiad.wmflabs` host and `eswiki_p` DB, and not to
    enwiki or other hosts
- Check your existing tools' and services' queries for cross-database JOINs,
  and rewrite the joins in application code
  - E.g.: if you are doing a join across databases, for example joining
    `enwiki_p` and `eswiki_p`, you will need to query them separately, and
    filter the results of the separate queries in the code
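
As a rough illustration of the two changes above, here is a minimal Python
sketch. It is not an official recipe: the helper names `replica_host` and
`join_on_first_column`, the `cluster` parameter, and the sample rows are all
made up for this example. The host name pattern is taken from the `eswiki`
example above, and the actual database connection (made with whatever MySQL
client library your tool already uses) is left out.

```python
from collections import defaultdict


def replica_host(wiki, cluster="analytics"):
    """Build the per-wiki host name, following the eswiki example above.

    The `cluster` parameter is an assumption made for this sketch.
    """
    return f"{wiki}.{cluster}.db.svc.eqiad.wmflabs"


def join_on_first_column(left_rows, right_rows):
    """Emulate an inner JOIN on the first column of two result sets.

    Each argument is a list of (key, value) tuples fetched separately,
    one query per wiki, each over its own connection.
    """
    right_by_key = defaultdict(list)
    for key, value in right_rows:
        right_by_key[key].append(value)
    return [
        (key, left_value, right_value)
        for key, left_value in left_rows
        for right_value in right_by_key.get(key, [])
    ]


# Hypothetical rows, standing in for the results of two separate queries:
# one run against enwiki_p on replica_host("enwiki"), the other against
# eswiki_p on replica_host("eswiki").
enwiki_rows = [("Alice", 120), ("Bob", 45)]
eswiki_rows = [("Alice", 7), ("Carol", 3)]

joined = join_on_first_column(enwiki_rows, eswiki_rows)
```

Indexing one result set by key before matching keeps the in-code join close
to linear in the number of rows, which matters if the separate queries
return large result sets.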
Timeline:
- November - December: Early adopter testing
- January 2021: Existing and new systems online, transition period starts
- February 2021: Old hardware is decommissioned
We need your help
- If you would like to beta test the new architecture, please let us know
and we will reach out to you soon
- You can also help by sharing examples / descriptions of how a tool or
service was updated, writing a common solution or some example code others
can utilize and reference, and helping others on IRC and the mailing lists
If you have questions or need help adapting your code or queries, please
contact us [2], or write on the talk page [3].
We will be sending reminders and more specific examples of the changes via
email and on the wiki page. For more information see [1].
[1]: https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign
[2]: https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_communication
[3]: https://wikitech.wikimedia.org/wiki/Talk:News/Wiki_Replicas_2020_Redesign
--
Joaquin Oltra Hernandez
Developer Advocate - Wikimedia Foundation
Hi!
There will be a general CloudVPS network maintenance on 2020-10-09 @ 12:30 UTC.
The operation window will last for 1h. During the operation, all cloud services
will be inaccessible or intermittently down.
This operation affects all CloudVPS projects, including Toolforge, PAWS and
Quarry. Services running in the cloud might fail to contact external entities,
and connections to ToolsDB, NFS, wiki-replicas or LDAP will be affected as well.
The operation we are doing today is a follow-up to what we did two weeks ago [0],
and involves changing the IP addressing of the network that connects the
CloudVPS network to the internet.
Sorry for the short notice, we couldn't avoid scheduling this for today.
Regards.
[0] https://phabricator.wikimedia.org/T265288
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation