Hi Everyone,
Over the last few months, the Wikimedia Developer Advocacy team has been
working to improve technical documentation for the MediaWiki Action API
<https://www.mediawiki.org/wiki/API:Main_page>.
So far, we have:
- Started efforts to revise, simplify, and reorganize the MediaWiki
Action API pages on MediaWiki using a new documentation template for
sub-pages: https://www.mediawiki.org/wiki/API:Documentation_template
- Updated the API navigation-template:
https://www.mediawiki.org/wiki/Template:API
As we continue to make improvements to the technical documentation, we
could use your help to better guide our efforts!
Would you please take a few moments to complete the following survey and
share your opinions and experiences with us?
https://goo.gl/forms/Y5PGILb6b3awC3OJ2
*Notes about the MediaWiki Action API Survey:*
*Survey Period:* December 6, 2018 - January 6, 2019
*Privacy Policy:* This survey will be conducted via a third-party service,
which may subject it to additional terms. For more information on privacy
and data-handling, see the survey privacy statement:
https://foundation.wikimedia.org/wiki/MediaWiki_Action_API_Survey_Privacy_S…
Thanks for your participation!
Kindly,
Sarah R. Rodlund
Technical Writer, Developer Advocacy
<https://meta.wikimedia.org/wiki/Developer_Advocacy>
srodlund(a)wikimedia.org
Tomorrow I'll be moving the grid engine master node to a new virt host.
That will cause a 15-minute outage during which new jobs (crons, or
things submitted by hand) will fail.
Existing jobs or webservices will be unaffected by the downtime.
I'll start the move at 16:00 UTC on Friday, 2018-12-21. That's 8AM in
California.
-Andrew
Hi!
Tomorrow 2018-12-20 @ 17:00 UTC (~24h from now) we will be conducting
some network maintenance in Cloud VPS (openstack).
We will be doing some work on the transport network that connects the
Neutron server to the rest of the internet. Running CloudVPS instances
will see a brief connection interruption to any external service
(outside CloudVPS).
If everything goes as planned (and our tests suggest it will), all
operations will be finished in just a couple of minutes.
Please let us know about any issues you find. Thanks.
Hello,
Today we have disabled BigBrother in Toolforge. BigBrother was a tool
that monitored continuous jobs which failed to get restarted because
they ran into corner cases where Grid Engine wasn't smart enough to
restart them (e.g. running out of memory). BigBrother would
continuously monitor those jobs and duplicate the restart
functionality in a layer above Grid Engine.
Although very few tools used BigBrother (0.65%, to be precise), it
taxed our NFS file server constantly, so keeping it around didn't make
much sense. Additionally, its functionality can easily be implemented
with a shell script run from cron.
So we've converted all tools that had a .bigbrotherrc file to use a
bigbrother.sh script that is triggered every 5 minutes to restart jobs.
If your tool used BigBrother, please check your crontab (`crontab -l`);
you will see a few entries like this:
```
# Ensure continuous jobs are running
*/5 * * * * jlocal /data/project/tool_name/bigbrother.sh job_name job_script
```
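For reference, here is a minimal sketch of what such a cron-driven restart script could look like. This is an illustration only, not the exact script deployed to each tool; the `qstat`/`jsub` invocations are assumptions about the usual grid engine client tools:

```
#!/bin/bash
# bigbrother.sh -- illustrative sketch of a cron-driven job restarter
# (not necessarily the exact script deployed to each tool).
# Usage: bigbrother.sh <job_name> <job_script>

ensure_running() {
    local job_name="$1" job_script="$2"
    # qstat -j <name> exits non-zero when the grid engine knows no job
    # by that name, i.e. the continuous job has died.
    if ! qstat -j "$job_name" >/dev/null 2>&1; then
        jsub -N "$job_name" -continuous "$job_script"
    fi
}

if [ "$#" -eq 2 ]; then
    ensure_running "$1" "$2"
fi
```

The crontab entry shown above then simply invokes this script every five minutes with the tool's job name and start script.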
Documentation has also been updated to reflect this change:
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Grid#Bigbrother_(Depreca…
In our tests everything worked fine, but please let us know if your
tool is impacted by this change.
Regards,
--
Giovanni Tirloni
Operations Engineer
Wikimedia Cloud Services
On Monday, December 3rd, 2018 at 17:00 UTC, we will be rebooting one of the two dumps NFS servers (labstore1006.wikimedia.org). This may briefly cause elevated load, but should be quick enough that failing over services is unlikely to be helpful. We will be failing over the web service before that time and failing it back before rebooting the partner server (labstore1007.wikimedia.org) on Friday, December 7th at 17:00 UTC. This should not interrupt service to dumps.wikimedia.org (the site hosted on these systems), since it will be failed over to the non-rebooting partner.
Brooke Storm
Operations Engineer
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm_
I recently noticed that some of our standard kvm/nova monitoring never
got copied over from the labvirt puppet code to the cloudvirt puppet
code. Tomorrow I will merge
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/478113/ to fix that.
Once that patch is merged, icinga will be a bit touchier on the
cloudvirts. In particular, it will alert for any cloudvirt that has 0
VMs running on it. (This turns out to be a useful thing to watch for
because we've had cases where every single kvm process died at once.)
So, all 'idle' cloudvirts should nonetheless have a canary instance.
For example, on the new analytics cloudvirts I created canaries like this:
```
$ OS_PROJECT_ID=testlabs openstack server create \
    --image 7c6371d1-8411-48c7-bf73-2ef6d6ff2a15 \
    --flavor m1.small \
    --nic net-id=7425e328-560c-4f00-8e99-706f3fb90bb4 \
    --availability-zone host:cloudvirtan1004 \
    canary-an1004-01
```
Once a virt host is in full service we can leave the canaries there or
delete them -- there hasn't been any real consistent policy there.
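Counting a host's VMs is straightforward with the standard openstack CLI. As an illustrative sketch of the kind of zero-VM check described above (the script name and Icinga-style exit codes here are hypothetical, not taken from the patch):

```
#!/bin/bash
# check_cloudvirt_vms.sh -- illustrative sketch of a "0 VMs running"
# check; the real check comes from the puppet patch and may differ.
# Usage: check_cloudvirt_vms.sh <hypervisor>

count_vms() {
    local host="$1"
    # List every project's servers scheduled on this hypervisor, one ID
    # per line, and count them. grep exits non-zero on zero matches, so
    # swallow that to keep the count ("0") as the only result.
    openstack server list --all-projects --host "$host" -f value -c ID \
        | grep -c . || true
}

check_host() {
    local host="$1" n
    n=$(count_vms "$host")
    if [ "$n" -eq 0 ]; then
        # A canary instance should keep this from ever firing on a
        # healthy, "idle" host.
        echo "CRITICAL: $host is running 0 VMs"
        return 2
    fi
    echo "OK: $host is running $n VMs"
}

if [ "$#" -eq 1 ]; then
    check_host "$1"
fi
```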
In related news, I'm attempting to silence cloudvirt1019 and 1020
altogether with
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/478115/ because
we reboot them twice a day and a reboot always kills any running VMs.
-Andrew
With any luck we'll have some more hardware installed by next week, so
it's time to move more projects! This is probably the last round of
bulk moves; after this it's all special cases for which I'll contact
people directly.
Tuesday, 2018-12-11: maps, wm-bot
Wednesday, 2018-12-12: mwoffliner, wildcat
Thursday, 2018-12-13: snuggle, services, commonsarchive, wikitextexp
Friday, 2018-12-14: queryrapi, wikidumpparse, wikistats, butterfly
Monday 2018-12-17: huggle, incubator, iiab, openrefine, wcdo,
wikidataconcepts
Tuesday 2018-12-18: wikimetrics, newsletter, telnet, signwriting,
ogvjs-ingetration
Wednesday 2018-12-19: multimedia, orig, security-tools, phragile,
wikistream, otrs, yandex-proxy
Thursday 2018-12-20: dashiki, etytree, partnermetrics, graphql
Some context for what this is all about can be found here:
https://phabricator.wikimedia.org/phame/post/view/120/neutron_is_here/
Please let me know if you are involved in one of those projects and need
to postpone the move or schedule a to-the-minute migration window.
- Andrew + the WMCS team
ToolsDB will be undergoing maintenance and updates on Tuesday, November 27th, from 17:30 UTC to 18:00 UTC.
Actual outage time should be fairly brief, but during this window the database will be taken offline and the system rebooted. Due to the expected brief nature of the outage and the fact that some tables are not replicated (see https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#ToolsDB_Backups…), we are not planning to fail over to the replica at this time.
Brooke Storm
Operations Engineer
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm_
Hi,
next Tuesday, 2018-11-27 @ 17:30 UTC, we will reboot the
labnet1001.eqiad.wmnet server for maintenance and security updates.
This server provides virtual networking services for CloudVPS in the
main deployment (the old one, different from the eqiad1 deployment).
We won't be doing any failover prior to the reboot, for operational
reasons (we measured that failover downtime is longer than the actual
reboot time).
The impact of this brief reboot downtime will be:
* all VMs in the main CloudVPS deployment won't have network connectivity
* ongoing network connections (downloads, uploads) will fail and will
have to be restarted
* cross connectivity between VM instances in the main and eqiad1
deployments won't be possible
Thanks for your understanding, and let us know any issues you may find
after the reboot next week.
Hi,
next Tuesday 2018-11-20 at 17:30 UTC we will be rebooting the OSM
database (part of our data services) for maintenance and security updates.
Specifically, the labstore1006.eqiad.wmnet (osmdb.eqiad.wmnet) server
will be rebooted. The other server in the cluster,
labstore1007.eqiad.wmnet, has already been rebooted, but we won't be
doing any pre-failover, for operational reasons.
Apologies in advance for any inconvenience, and please let us know
about any issues you find after these operations.