(est. 3 minute read)
https://phabricator.wikimedia.org/phame/live/1/post/140/
-------
How'd we do in our pursuit of operational excellence last month? Read on to
find out!
- Month in numbers.
- Highlighted stories.
- Current problems.
## *Month in numbers*
* 4 documented incidents in January 2019. [1]
* 16 Wikimedia-prod-error tasks closed. [2]
* 17 Wikimedia-prod-error tasks created. [3]
## *Unable to move certain file pages*
Xiplus reported that renaming a File page on
zh.wikipedia.org led to a
fatal database exception. Andre Klapper identified the stack trace from the
logs, and Brad (Anomie) investigated.
The File renaming failed because the File page did not have a media file
associated with it (such move action is not currently allowed in
MediaWiki). But, while handling this error the code caused a different
error. The impact was that the user didn't get informed about why the move
failed. Instead, they received a generic error page about a fatal database
exception.
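
The failure pattern described above, where a second error raised while handling the first one hides the real reason, can be sketched roughly as follows. This is a simplified Python illustration and not MediaWiki's actual code; `MoveNotAllowed`, `move_file_page`, `failing_cleanup`, and both handler functions are hypothetical names.

```python
class MoveNotAllowed(Exception):
    """Hypothetical: raised when a File page has no media file to move."""


def move_file_page(has_media_file):
    if not has_media_file:
        raise MoveNotAllowed("This File page has no associated media file.")
    return "moved"


def failing_cleanup():
    # Hypothetical cleanup step that itself breaks while handling the error.
    raise RuntimeError("Fatal database exception")


def buggy_handler(has_media_file):
    try:
        return move_file_page(has_media_file)
    except MoveNotAllowed as err:
        failing_cleanup()  # raises, so the line below is never reached
        return f"Move failed: {err}"


def fixed_handler(has_media_file):
    try:
        return move_file_page(has_media_file)
    except MoveNotAllowed as err:
        # Report the original reason to the user instead of failing again.
        return f"Move failed: {err}"
```

With the buggy handler the user only ever sees the generic `RuntimeError`; with the fixed one they see why the move was refused.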
Brad fixed the code a few hours later, and it was deployed by Roan later
that same day.
Thanks!
https://phabricator.wikimedia.org/T213168
## *DBPerformance regression detected and fixed*
During a routine audit of Logstash dashboards, I found a DBPerformance
warning. The warning indicated that the limit of 0 for "master connections"
was violated. That's a cryptic way of saying it found code in MediaWiki
that uses a database master connection on a regular page view.
MediaWiki can have many replica database servers, but there can be only one
master database at any given moment. To reduce the chances of overload,
delayed edits, or network congestion, we make sure to use replicas
whenever possible. We usually involve the master only when source data is
being changed, or is about to be changed (for example, when editing a page
or saving changes).
As the vast majority of traffic is page views, we have lower thresholds for
latency and dependency on page views. In particular, page views may (in the
future) be routed to secondary data centres that don't even have a master
DB.
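
As a rough illustration of the idea, the sketch below routes reads to a replica and records a warning when a request opens more master connections than its limit allows. It is a toy Python model, not the MediaWiki implementation; the `ConnectionTracker` class, its method names, and the warning text are all assumptions made for this example.

```python
class ConnectionTracker:
    """Hypothetical tracker counting master connections opened in one request."""

    def __init__(self, master_limit):
        self.master_limit = master_limit
        self.master_connections = 0
        self.warnings = []

    def get_connection(self, for_write):
        # Reads go to a replica; only writes should reach the single master.
        if not for_write:
            return "replica"
        self.master_connections += 1
        if self.master_connections > self.master_limit:
            self.warnings.append(
                f"master connections: {self.master_connections} "
                f"exceeds limit of {self.master_limit}"
            )
        return "master"


# A page view is expected to use no master connection at all (limit of 0).
page_view = ConnectionTracker(master_limit=0)
page_view.get_connection(for_write=False)  # fine: served by a replica
page_view.get_connection(for_write=True)   # regression: records a warning
print(page_view.warnings)
```

The point of a limit of 0 is that even a single master connection on a page view is treated as a regression worth reporting.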
Tchanders (from the Anti-Harassment team) investigated the issue, found the
culprit, and fixed it in time for the next MediaWiki train. Thanks!
https://phabricator.wikimedia.org/T214735
## *TemplateData missing in action*
Tacsipacsi and Evad37 both independently reported the same TemplateData
issue. TemplateData powers the template insertion dialog in VisualEditor.
It wasn't working for some templates after we deployed the 1.33-wmf.13
branch.
The error was "Argument 1 passed to ApiResult::setIndexedTagName() must be
an instance of array, null given". This means some code called a
function with the wrong argument. For example, the variable name may've
been misspelled, or it may've been the wrong variable, or (in this case)
the variable didn't exist. In such a case, PHP implicitly assumes "null".
Bartosz (MatmaRex) found the culprit. The week before, I made a change to
TemplateData that changed the "template parameter order" feature to be
optional. This allows users to decide whether VisualEditor should force an
order for the parameters in the wikitext. It turned out I forgot to update
one of the references to this variable, which still assumed it was always
present.
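
The shape of this bug can be sketched in a few lines. The example below is a Python analogue, not the TemplateData source (which is PHP); `set_indexed_tag_name`, both formatter functions, and the use of a `paramOrder` key are illustrative assumptions.

```python
def set_indexed_tag_name(items, tag):
    """Hypothetical stand-in for a function that requires a list."""
    if not isinstance(items, list):
        raise TypeError(f"Argument 1 must be a list, {type(items).__name__} given")
    return {"tag": tag, "items": items}


def format_template_data(data):
    # Bug: "paramOrder" became optional, so .get() may return None here.
    return set_indexed_tag_name(data.get("paramOrder"), "p")


def format_template_data_fixed(data):
    result = {"params": data["params"]}
    # Only touch the optional key when it is actually present.
    if "paramOrder" in data:
        result["paramOrder"] = set_indexed_tag_name(data["paramOrder"], "p")
    return result
```

Calling `format_template_data` on data without the optional key raises a type error, while the fixed variant simply omits the parameter order from its result.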
Brad (Anomie) fixed it later that week, and it was deployed the next day.
Thanks!
https://phabricator.wikimedia.org/T213953
## *Current problems*
Take a look at the workboard and look for tasks that might need your help.
The workboard lists known issues, grouped by the week in which they were
first observed.
https://phabricator.wikimedia.org/tag/wikimedia-production-error/
As of 12 February 2019, there are 188 open Wikimedia-prod-error tasks.
(We've had a slight increase since November; 165 in December, 172 in
January.)
For this month's edition, I'd like to draw attention to a few older issues
that are still reproducible:
* [2013; Collection extension] Special:Book fatal error for blocked users.
https://phabricator.wikimedia.org/T56179
* [2013; CentralNotice] Fatal error when placeholder key contains a space.
https://phabricator.wikimedia.org/T58105
* [2014; LQT] Fatal error when attempting to view certain threads.
https://phabricator.wikimedia.org/T61791
* [2015; MassMessage] Warning about Invalid message parameters.
https://phabricator.wikimedia.org/T93110
* [2015; Wikibase] Warning "UnresolvedRedirectException" for some pages on
Wikidata (and Commons).
https://phabricator.wikimedia.org/T93273
## Terminology
A "Fatal error" (or uncaught exception) prevents a user action. For example,
a page might display "MWException: Unknown class NotificationCount."
instead of the article content.
A "Warning" (or non-fatal, or PHP error) lets the program continue to
display a mostly complete page regardless. This may cause corrupt, incorrect, or
incomplete information to be shown. For example, a user may receive a
notification that says "You have (null) new messages".
## Thanks!
Thank you to everyone who has helped by reporting, investigating, or
resolving problems in Wikimedia production, including: Xiplus, Anomie,
Daimona, Gilles, He7d3r, Jdforrester, MatmaRex, MModell, Nikerabbit,
Catrope, Tchanders, Tgr, and Thiemo.
Thanks!
Until next time,
-- Timo Tijhof
*There's a snake in my boot. Reach for the sky!*
-------
Footnotes:
[1] Incidents.
https://wikitech.wikimedia.org/wiki/Special:AllPages?from=Incident+document…
[2] Tasks closed.
https://phabricator.wikimedia.org/maniphest/query/COTGbmxGcm_l/#R
[3] Tasks created.
https://phabricator.wikimedia.org/maniphest/query/DLRuzOg9bSJA/#R