On Thursday, July 25th, 2019, between 15:00 and 17:00 UTC, we will
be performing system maintenance on the NFS servers that support Toolforge
and the CloudVPS instances that are using NFS for home, project or scratch
data.
During this maintenance window, we will be applying rolling updates that
require NFS service restarts. We are taking precautions to minimize impact,
but there may be short periods of NFS service interruption or performance
degradation.
---
Jason Hedden
Site Reliability Engineer - Wikimedia Cloud Services
Wikimedia Foundation
As part of routine networking and OS upgrades, I'll be emptying two
hypervisors (cloudvirt1016 and cloudvirt1017) on Monday and Tuesday, the
22nd and 23rd. This will result in downtime for many VMs as they are
copied and restarted. A complete list of affected instances follows.
I'll begin by moving the deployment-prep project at around 13:00
UTC on Monday. After that, copies will proceed in roughly the order you
see below, but the timing will be hard to predict.
Please let me know if you need to schedule a more specific window
for your downtime. Better yet, if any of the listed VMs are defunct and
can simply be deleted, please do that now and save me some time!
-Andrew + the WMCS team
Affected instances (shown as <project>: <instance name>):
account-creation-assistance: accounts-appserver4
account-creation-assistance: accounts-mwoauth
automation-framework: af-puppetdb02
butterfly: butterfly-m4m2
cloudinfra: cloudinfra-db02
codereview: Krypton
codereview: Radon
commtech: commtech-2
community-labs-monitoring: clm-web-01
community-labs-monitoring: clm-worker-01
dashiki: dashiki-01
dashiki: dashiki-staging-01
deployment-prep: deployment-cache-text05
deployment-prep: deployment-changeprop
deployment-prep: deployment-chromium01
deployment-prep: deployment-chromium02
deployment-prep: deployment-cpjobqueue
deployment-prep: deployment-dumps-puppetmaster02
deployment-prep: deployment-elastic06
deployment-prep: deployment-elastic07
deployment-prep: deployment-etcd-01
deployment-prep: deployment-eventlog05
deployment-prep: deployment-imagescaler01
deployment-prep: deployment-imagescaler02
deployment-prep: deployment-ircd
deployment-prep: deployment-jobrunner03
deployment-prep: deployment-kafka-jumbo-1
deployment-prep: deployment-logstash2
deployment-prep: deployment-mediawiki-07
deployment-prep: deployment-memc06
deployment-prep: deployment-memc07
deployment-prep: deployment-mwmaint01
deployment-prep: deployment-ores01
deployment-prep: deployment-puppetdb02
deployment-prep: deployment-puppetmaster03
deployment-prep: deployment-restbase01
deployment-prep: deployment-restbase02
deployment-prep: deployment-sentry01
deployment-prep: deployment-snapshot01
deployment-prep: deployment-urldownloader02
deployment-prep: deployment-zookeeper02
design: design-research-methods
dumps: dumps-0
dwl: dwl
dwl: taxonbota
fa-wp: tofawiki02
getstarted: gitservices
getstarted: webservices
glampipe: Glampipe
hound: hound-puppet-02
integration: integration-cumin
integration: integration-r-lang-01
integration: integration-slave-docker-1040
integration: integration-slave-docker-1041
integration: integration-slave-jessie-1002
k8splay: k8s-dzahn
lizenzhinweisgenerator: lizenzhinweisgenerator
maps: maps-tiles1
maps: maps-warper3
openrefine: openrefine01
openstack: cloud-bootstrapvz-stretch
otrs: otrs-oneclickspam-test
packagist-mirror: packagist-mirror1
partnermetrics: partnermetrics-redis-01
puppet-diffs: compiler1001
qna: meza-new2
quotatest: novaadminmadethis6
reading-web-staging: readers-web-master
recommendation-api: missing-sections
recommendation-api: rec-wiki
recommendation-api: related-articles
recommendation-api: tool
security-tools: logparse01
sentry: frama-test5
sentry: frama-test6-sb
services: kask
services: kask-client
shinken: shinken-02
shiny-r: discovery-production-02
testlabs: abogott-puppetmaster
testlabs: canary1016-01
tools: tools-sgecron-01
tools: tools-sgegrid-shadow
toolsbeta: toolsbeta-sgecron-01
toolsbeta: toolsbeta-sgegrid-shadow
toolsbeta: toolsbeta-sgewebgrid-lighttpd-0901
twl: wmil
video: encoding01
video: gfg01
video: video-redis
video: videodev
videowiki: app-instance
visualeditor: dumpgrepper
webperf: disposable
wikidata-dev: wikidata-constraints
wikidata-federation: federated-commons
wikidata-federation: federated-wikidata
wikidiff2-wmde-dev: wmde-wikidiff2-jacnth
wikidocumentaries: hupu
wikidocumentaries: roope
wikidumpparse: whgi
wikifactmine: elasticsearch-20
wikifactmine: elasticsearch-21
wikifactmine: puppetmaster-01
wikilabels: wikilabels-02
wikilabels: wikilabels-experiment
wikimetrics: wikimetrics-01
wikistream: ws-web
wikitextexp: wikitextexp-base-1002
wikitextexp: wikitextexp-expt-1002
wm-bot: wm-bot-pg
wm-bot: wm-bot2
wmf-research-tools: diegoTest
wmf-research-tools: wikilabels
wpx: wpx-redirects-01
On Friday I'll be moving the toolforge cron server to new hardware.
During the move, any uses of the 'crontab' command will fail
gracelessly. Any cron jobs scheduled to launch during the downtime will
be skipped.
The move should take 5-10 minutes but may take as long as 30 if there
are complications.
-Andrew
The latest Debian version, 10.0 "buster", was officially released a
few days ago[0]. Today, I've built a new Debian buster base image and
made it available in all projects.
The Stretch base image will remain available for some time to
permit compatibility with existing setups, but any new work should use
buster. If you have existing instances that were built using the buster
prerelease images (or instances that you manually upgraded from an earlier
version) I encourage you to delete them and rebuild with this new base
image for the most consistent results.
-Andrew
Cross-posting from the mediawiki-api-announce(a)lists.wikimedia.org list.
---------- Forwarded message ---------
From: Brad Jorsch (Anomie) <bjorsch(a)wikimedia.org>
Date: Fri, Jun 21, 2019 at 8:30 AM
Subject: [Mediawiki-api-announce] BREAKING CHANGE: Improved timestamp support
To: <mediawiki-api-announce(a)lists.wikimedia.org>
An upgrade to the timestamp library used by MediaWiki is resulting in
two changes to the handling of timestamp inputs to the action API.
There will be no change to timestamps output by the API.
All of these changes should be deployed to Wikimedia wikis with 1.34.0-wmf.10.
Historically MediaWiki has ignored timezones in supported formats that
include timestamps, treating them as if the timezone specified were
UTC. In the future, specified timezones will be honored (and converted
to UTC).
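To illustrate the difference, here is a minimal Python sketch of the two behaviours. It is not MediaWiki's implementation (that is a PHP library); the function names old_behavior and new_behavior are hypothetical, and the sample timestamp is made up for the example.

```python
from datetime import datetime, timezone


def old_behavior(ts: str) -> datetime:
    """Historical handling: ignore the offset, treat the clock time as UTC."""
    parsed = datetime.fromisoformat(ts)
    return parsed.replace(tzinfo=timezone.utc)


def new_behavior(ts: str) -> datetime:
    """New handling: honor the offset and convert the instant to UTC."""
    parsed = datetime.fromisoformat(ts)
    if parsed.tzinfo is None:
        # No timezone given: assume UTC, as before.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


sample = "2019-05-22T12:00:00+02:00"
print(old_behavior(sample).isoformat())  # 2019-05-22T12:00:00+00:00
print(new_behavior(sample).isoformat())  # 2019-05-22T10:00:00+00:00
```

Clients that already send UTC timestamps (or timestamps without a timezone) should see no difference; only inputs carrying a non-UTC offset are interpreted differently.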
Historically some invalid formats were accepted, such as
"2019-05-22T12:00:00.....1257" or "Wed, 22 May 2019 12:00:00 A
potato". Due to improved validation, these will no longer be accepted.
Support for ISO 8601 and other formats has also been improved. See
https://www.mediawiki.org/wiki/Timestamp for details on the formats
that will be supported.
_______________________________________________
Mediawiki-api-announce mailing list
Mediawiki-api-announce(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce
The actor and comment views on the wiki replicas are slowed by a need
to make subqueries against 8 other tables in order to determine
which rows should and should not be visible on the replica service. With
recent changes to the replica view schema, this problem has become much more visible.
The WMCS team has deployed a set of specialized views of these two tables, so
that an individual query is slowed by only a single subquery against
a related target. For example, a query for an actor mentioned in the log_actor field of
the logging table could be made against actor_logging, which will only check against
logic in the actor table, not the 7 other tables that aren't related to the query.
On the flip side, the actor_logging view will only have rows that are exposed
in the logging table.
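As a rough sketch of what this could look like from a tool, the Python snippet below builds a query that joins logging to the actor_logging view instead of the full actor view. The column names (log_id, log_type, log_timestamp, actor_id, actor_name) are assumed from the standard MediaWiki schema, the %s placeholder assumes a pymysql-style driver, and the helper name is hypothetical.

```python
# Query joining logging to the specialized actor_logging view rather than
# the general actor view. Column names are assumptions based on the
# standard MediaWiki schema; check them against the replica you use.
ACTOR_LOGGING_QUERY = (
    "SELECT log_id, log_type, log_timestamp "
    "FROM logging "
    "JOIN actor_logging ON actor_id = log_actor "
    "WHERE actor_name = %s"
)


def logging_query_for(actor_name: str):
    """Return (sql, params) suitable for cursor.execute(sql, params)."""
    return ACTOR_LOGGING_QUERY, (actor_name,)


sql, params = logging_query_for("Example")
print(sql)
print(params)
```

Passing the name as a parameter rather than formatting it into the string leaves quoting to the database driver.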
For more information, see: https://wikitech.wikimedia.org/wiki/News/Actor_storage_changes_on_the_Wiki_…
If other documentation about the Wiki Replicas on wikitech needs updating because of this change, we would
like your help finding it! Please let us know on IRC, in a Phabricator task, by email, or on wiki if you find things that need
updating related to the actor and comment tables. A Phabricator task is already open to update the related MediaWiki
documentation (https://phabricator.wikimedia.org/T225007), but it is likely that there are bits around wikitech
to update as well.
Brooke Storm
Operations Engineer
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm_
Similar to the earlier removal of text fields from the wiki replicas for the comment storage refactor in MediaWiki, we are going to remove the “user text” columns that are deprecated in the MediaWiki schema from the views, in preparation for their eventual removal upstream. The column drops are tracked and explained at https://phabricator.wikimedia.org/T223406. Tables with names such as <tablename>_compat will not see a difference in structure. The change is scheduled for Monday, May 27th.
The fields being dropped from the views are:
revision: rev_user and rev_user_text.
archive: ar_user and ar_user_text.
ipblocks: ipb_by and ipb_by_text.
image: img_user and img_user_text.
oldimage: oi_user and oi_user_text.
filearchive: fa_user and fa_user_text.
recentchanges: rc_user and rc_user_text.
logging: log_user and log_user_text.
Ideally, tools that connect to the replicas should gather this information from the appropriate entries in the actor table instead. Again, this is similar to the change for the comment table. The data is already there for you to start using. The alternative is to use the related <tablename>_compat table, which won’t be changing in a user-visible way at this time.
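Tool maintainers who want to check whether their queries are affected could scan them for the column names listed above. The following Python sketch does that with a simple word-boundary search; the helper name find_dropped_columns is hypothetical, and the check is purely textual (it does not parse SQL), so treat any hit as a hint to look more closely.

```python
import re
from typing import List

# Columns being dropped from the replica views, as listed in this announcement.
DROPPED_COLUMNS = {
    "revision": ("rev_user", "rev_user_text"),
    "archive": ("ar_user", "ar_user_text"),
    "ipblocks": ("ipb_by", "ipb_by_text"),
    "image": ("img_user", "img_user_text"),
    "oldimage": ("oi_user", "oi_user_text"),
    "filearchive": ("fa_user", "fa_user_text"),
    "recentchanges": ("rc_user", "rc_user_text"),
    "logging": ("log_user", "log_user_text"),
}


def find_dropped_columns(sql: str) -> List[str]:
    """Return the dropped column names that appear as whole words in sql."""
    found = []
    for columns in DROPPED_COLUMNS.values():
        for column in columns:
            # \b keeps rev_user from matching inside rev_user_text.
            if re.search(r"\b" + re.escape(column) + r"\b", sql):
                found.append(column)
    return found


print(find_dropped_columns("SELECT rev_id, rev_user_text FROM revision"))
# ['rev_user_text']
```

A query that reports no hits may still need changes if it selects every column with a wildcard.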
Brooke Storm
Operations Engineer
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm_
Hi!
On Monday, 2019-06-03, at 14:00 UTC+2, we will be rebuilding the
cloudservices1003 server, which hosts the designate service that serves
DNS requests for CloudVPS and Toolforge.
We have a backup server (cloudservices1004), so we don't expect much
downtime. However, DNS queries arrive at a high rate, so many of them
may fail while we stabilize the DNS service.
Please reach out to the WMCS team if you need more details or have any doubts.
regards.
--
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation
As part of the effort to retire labstore1003 and move to more modern hardware with some redundancy, we will begin switching the mounts for /data/scratch to the new server on Tuesday, 2019-05-28. Please avoid using the /data/scratch NFS mount during the maintenance, which starts at 18:00 UTC and should last about an hour; we will announce when it is over. The mount will be changing location and will be somewhat unstable during that time.
NFS changes have occasionally caused wider problems within Toolforge in the past, but this should be fairly low-impact since it does not affect /data/project or /home.
Brooke Storm
Operations Engineer
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm_
Good news from the Wikimedia Hackathon in Prague! We now have some
newer language runtimes for Node.js and Python3 available for
Kubernetes webservices. These newer versions match the versions that
were added for grid engine webservices when we upgraded to Debian
Stretch.
These new versions are available in parallel with the older Node.js
6.11 and Python 3.4 versions. This is the pattern we will use
whenever we add newer language runtime versions, so that
migrations are a bit easier for existing users. The new type names
are:
* node10
* python3.5
== Node.js 10 ==
$ webservice --backend=kubernetes node10 shell
Defaulting container name to interactive.
Use 'kubectl describe pod/interactive -n bd808-test' to see all of
the containers in this pod.
If you don't see a command prompt, try pressing enter.
$ nodejs --version
v10.4.0
$ npm --version
6.5.0
$ logout
Session ended, resume using 'kubectl attach interactive -c
interactive -i -t' command when the pod is running
Pod stopped. Session cannot be resumed.
== Python 3.5 ==
$ webservice --backend=kubernetes python3.5 shell
Defaulting container name to interactive.
Use 'kubectl describe pod/interactive -n bd808-test' to see all of
the containers in this pod.
If you don't see a command prompt, try pressing enter.
$ python3 --version
Python 3.5.3
$ logout
Session ended, resume using 'kubectl attach interactive -c
interactive -i -t' command when the pod is running
Pod stopped. Session cannot be resumed.
Bryan, on behalf of the Toolforge admin team
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Technical Engagement Boise, ID USA
irc: bd808 v:415.839.6885 x6855