Hi everyone,
We’re happy to announce the April 2022 edition of the Technical Community
Newsletter:
https://www.mediawiki.org/wiki/Technical_Community_Newsletter/2022/April
The newsletter is compiled by the Wikimedia Developer Advocacy Team. It
aims to share highlights, news, and information of interest from and about
the Wikimedia technical community.
The Wikimedia Technical Community is large and diverse, and we know we
can't capture everything perfectly. We would love to hear your ideas for
future newsletters. Let us know what you would like to see or highlights
you would like us to include.
If you'd like to keep up with updates and information, subscribe to the
Technical Community Newsletter:
https://www.mediawiki.org/wiki/Newsletter:Technical_Community_Newsletter
Thanks,
Melinda
--
Melinda Seckington
Developer Advocacy Manager
Wikimedia Foundation <https://wikimediafoundation.org/>
Hello,
We are updating Composer in the CI images from 2.1.8 to 2.3.3, which
updates most jobs relying on PHP.
The "Quibble" jobs have not been upgraded because their prerequisite
tasks have not been completed yet.
If you find something suspicious that may be related to the Composer
upgrade, please report it on the upgrade task or as a subtask:
https://phabricator.wikimedia.org/T303867
Thank you!
--
James Forrester & Antoine Musso
Hello all,
It's coming close to the time for annual appointments of community members
to serve on the Code of Conduct committee (CoCC). The Code of Conduct
Committee is a team of five trusted individuals (plus five auxiliary
members) with diverse affiliations responsible for general enforcement of
the Code of Conduct for Wikimedia technical spaces. Committee members are
in charge of processing complaints, discussing with the parties affected,
agreeing on resolutions, and following up on their enforcement. For more on
their duties and roles, see
https://www.mediawiki.org/wiki/Code_of_Conduct/Committee.
This is a call for community members interested in volunteering for
appointment to this committee. Volunteers serving in this role should be
experienced Wikimedians or have had experience serving in a similar
position before.
The current committee is doing the selection and will research and discuss
candidates. Six weeks before the beginning of the next Committee term,
meaning 07 May 2022, they will publish their candidate slate (a list of
candidates) on-wiki. The community can provide feedback on these
candidates, via private email to the group choosing the next Committee. The
feedback period will be two weeks. The current Committee will then either
finalize the slate, or update the candidate slate in response to concerns
raised. If the candidate slate changes, there will be another two-week
feedback period covering the newly proposed members. After the selections
are finalized, there will be a training period, after which the new
Committee is appointed. The current Committee continues to serve until the
feedback, selection, and training process is complete.
If you are interested in serving on this committee or would like to nominate a
candidate, please write an email to techconductcandidates AT wikimedia.org
with details of your experience on the projects, your thoughts on the Code
of Conduct and the committee, what you hope to bring to the role, and
whether you would prefer to be an auxiliary or main member of the
committee. The committee consists of five main members plus five auxiliary
members, who serve for a year. All applications are appreciated
and will be carefully considered. The deadline for applications is *the end
of day on 30 April 2022*.
Please feel free to pass this invitation along to any users who you think
may be qualified and interested.
Best,
Martin Urbanec, on behalf of the Code of Conduct Committee
How’d we do in striving for operational excellence last month? Read on to find out!
Incidents
We've had quite the month, with 8 documented incidents. That's more than double the two-year median of three a month (Incident graphs <https://codepen.io/Krinkle/full/wbYMZK>).
2022-03-01 ulsfo network <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-01_ulsfo_network>
Impact: For 20 minutes, clients normally routed to Ulsfo were unable to reach our projects. This includes New Zealand, parts of Canada, and the United States west coast.
2022-03-04 esams availability banner sampling <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-04_esams_availability…>
Impact: For 1.5 hours, all wikis were largely unreachable from Europe (via Esams), with more limited impact across the globe via other data centers as well.
2022-03-06 wdqs-categories <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-06_wdqs-categories>
Impact: For 1.5 hours, some requests to the public Wikidata Query Service API were sporadically blocked.
2022-03-10 site availability <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-10_MediaWiki_availabi…>
Impact: For 12 min, all wikis were unreachable to logged-in users, and to unregistered users trying to access uncached content.
2022-03-27 api <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-27_api>
Impact: For ~4 hours, in three segments of 1-2 hours each over two days, there were higher levels of failed or slow MediaWiki API requests.
2022-03-27 wdqs outage <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-27_wdqs_outage>
Impact: For 30 minutes, all WDQS queries failed due to an internal deadlock.
2022-03-29 network <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-29_network>
Impact: For approximately 5 minutes, Wikipedia and other Wikimedia sites were slow or inaccessible for many users, mostly in Europe/Africa/Asia. (Details not public at this time.)
2022-03-31 api errors <https://wikitech.wikimedia.org/wiki/Incidents/2022-03-31_api_errors>
Impact: For 22 minutes, API server and app server availability were slightly decreased (~0.1% errors, all for s7-hosted wikis such as Spanish Wikipedia), and the latency of API servers was elevated as well.
Incident follow-up
Remember to review and schedule Incident Follow-up (Sustainability) <https://phabricator.wikimedia.org/project/view/4758/> tasks in Phabricator, which are preventive measures and tech debt mitigations written down after an incident is concluded. Read more about past incidents at Incident status <https://wikitech.wikimedia.org/wiki/Incident_status> on Wikitech. Some recently completed sustainability work:
Add linecard diversity to router-to-router interconnect at Codfw <https://phabricator.wikimedia.org/T248506>
Filed by Chris (SRE Infra) in 2020 after an incident where all hosts in the Codfw data center lost connectivity at once. Completed by Arzhel and Cathal (SRE Infra), and Papaul (DC Ops), including in Esams, where the same issue existed.
Expand parser tests to cover language conversion variants in table-of-contents output <https://phabricator.wikimedia.org/T295187>
Suggested and carried out by CScott (Parsoid) after reviewing an incident in November. The TOC on wikis that rely on the LanguageConverter service (such as Chinese Wikipedia) was no longer localized.
Fix unquoted URL parameters in Icinga health checks <https://phabricator.wikimedia.org/T304323>
Suggested by Riccardo (SRE Infra) in response to an early warning signal for TLS certificate expiry. He realized that automated checks for a related cluster were still claiming to be in good health, when they in fact should have been firing a similar warning. Carried out by Filippo and Dzahn.
Provide automation to quickly show replication status when primary is down <https://phabricator.wikimedia.org/T281249>
Filed in April by Jaime (SRE Data Persistence), carried out by John and Ladsgroup.
Trends
Since the last edition, we resolved 24 of the 301 unresolved errors that carried over from previous months.
In March, we created 54 new production errors <https://phabricator.wikimedia.org/maniphest/query/ryOkF_JP6cV1/#R>. That's quite high compared to the twenty-odd reports we find most months. Of these, 17 remain open today a month later.
In the month of April, so far, we reported 20 new errors <https://phabricator.wikimedia.org/maniphest/query/1LEA6jQzf7iU/#R>, of which 17 also remain open today.
The production error workboard once again adds up to exactly 298 open tasks (spreadsheet <https://docs.google.com/spreadsheets/d/e/2PACX-1vTrUCAI10hIroYDU-i5_8s7pony…>).
Take a look at the workboard and look for tasks that could use your help.
→ https://phabricator.wikimedia.org/tag/wikimedia-production-error/
Thanks!
Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production.
Until next time,
– Timo Tijhof
🔗 Share or read later via https://phabricator.wikimedia.org/phame/post/view/283/
*tl;dr:* What We Learned from Trainsperiment Week
<https://phabricator.wikimedia.org/phame/post/view/281/what_we_learned_from_…>
Release Engineering took the feedback from the Trainsperiment survey and
posted it on our blog
<https://phabricator.wikimedia.org/phame/post/view/281/what_we_learned_from_…>—there
are a lot of cool charts to see!
Trainsperiment week happened the week of March 21st when we deployed
MediaWiki versions 1.39.0-wmf.1–1.39.0-wmf.4 in a single week.
Thank you to everyone who took the time to give us feedback, and worked
with us while we tried something new.
<3
Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation
Hello,
(If you don't query the revision_actor_temp table, feel free to ignore this
email.)
We will remove revision_actor_temp in two weeks. This table was only built
for temporary use inside MediaWiki.
Cloud replicas provide a view that lets you query the revision table
directly: instead of joining the revision table with
revision_actor_temp, you can simply read the value of the rev_actor field.
We have finally backfilled the value of rev_actor in production, so it
can be used directly (thus we will remove the view soon).
If you query this table in your tools, reports, etc., you need to change
them ASAP. Hopefully this will make your queries much simpler. Keep in mind
that similar work will happen on the revision_comment_temp table in the future.
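For illustration only, here is a rough sketch of what the change might look like in a tool. It uses an in-memory SQLite database as a stand-in for the replicas, with a toy schema containing only the columns needed for the example. The revactor_rev and revactor_actor column names are not stated in this email and are assumed here, so please check them against the real schema before relying on them.

```python
import sqlite3

# Old pattern (assumed column names): join revision with
# revision_actor_temp to find the actor of each revision.
OLD_QUERY = """
SELECT rev_id, revactor_actor AS actor_id
FROM revision
JOIN revision_actor_temp ON revactor_rev = rev_id
ORDER BY rev_id
"""

# New pattern: read rev_actor straight from the revision table.
NEW_QUERY = """
SELECT rev_id, rev_actor AS actor_id
FROM revision
ORDER BY rev_id
"""


def build_demo_db():
    """Create a toy in-memory database; this is NOT the real schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE revision (
            rev_id INTEGER PRIMARY KEY,
            rev_page INTEGER,
            rev_actor INTEGER
        );
        CREATE TABLE revision_actor_temp (
            revactor_rev INTEGER PRIMARY KEY,
            revactor_actor INTEGER
        );
        INSERT INTO revision VALUES (1, 10, 101), (2, 10, 102), (3, 11, 101);
        INSERT INTO revision_actor_temp VALUES (1, 101), (2, 102), (3, 101);
        """
    )
    return conn


conn = build_demo_db()
old_rows = conn.execute(OLD_QUERY).fetchall()
new_rows = conn.execute(NEW_QUERY).fetchall()

# With rev_actor backfilled, both queries should return the same rows.
print(old_rows == new_rows)
```

In a real tool, the only change should be dropping the join and selecting rev_actor in place of the revision_actor_temp column.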
You can follow the work in https://phabricator.wikimedia.org/T275246
Best
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
The 1.39.0-wmf.8 version of MediaWiki is blocked[0].
The new version is deployed to testwikis[1], but can proceed no
further until these issues are resolved:
* Fatal exception of type "UnexpectedValueException" when attempting
to block - https://phabricator.wikimedia.org/T305786
* Templates get transcluded in (un)delete reason for associated talk
page - https://phabricator.wikimedia.org/T306431
Once these issues are resolved, the train can resume. If these issues are
resolved on a Friday, the train will resume Monday.
Thank you for your help resolving these issues!
-- Your humble train toiler
[0]. <https://phabricator.wikimedia.org/T305214>
[1]. <https://versions.toolforge.org/>
--
Jeena Huneidi
Software Engineer, Release Engineering
Wikimedia Foundation
Hello everyone,
The third workshop on the topic of "Writing Pywikibot scripts" is coming up
- it will take place on Friday, April 29th at 16:00 UTC. You can find more
details on the workshop and a link to join here:
<https://meta.wikimedia.org/wiki/Small_wiki_toolkits/Workshops#How_to_write_…>
[1].
This workshop will introduce participants to writing basic scripts via the
Pywikibot framework. We will be focusing on examples of scripts that
participants have requested to cover in the workshop (e.g., finding and
replacing content, archiving discussions, etc.). You can add your ideas to
the ongoing discussion in the etherpad doc linked from the workshops page. If
you missed attending the previous two workshops, going through the workshop
materials beforehand would be beneficial.
We look forward to your participation!
Best,
Srishti
On behalf of the SWT Workshops Organization team
[1]
https://meta.wikimedia.org/wiki/Small_wiki_toolkits/Workshops#How_to_write_…
*Srishti Sethi*
Senior Developer Advocate
Wikimedia Foundation <https://wikimediafoundation.org/>