(If you don’t work with links tables such as templatelinks, pagelinks and
so on, feel free to ignore this message)
TLDR: The schema of links tables (starting with templatelinks) will change
to use a numeric id pointing to the linktarget table instead of repeating
the namespace and title.
Hello,
The current schema and storage of most links tables are: page id (the
source), namespace id of the target link and title of the target. For
example, if a page with id of 1 uses Template:Foo, the row in the database
would be 1, 10, and Foo (the Template namespace has id 10).
Repeating the target’s title is not sustainable. For example, more than half
of the Wikimedia Commons database is just three links tables. The sheer size
of these tables makes a considerable portion of all queries slower, and makes
backups and dumps take longer and use much more space than needed due to
unnecessary duplication. In Wikimedia Commons, on average a title is
duplicated around 100 times for templatelinks and around 20 times for
pagelinks. The numbers for other wikis depend on the usage patterns.
Moving forward, these tables will be normalized, meaning a typical row will
hold a mapping of page id to linktarget id instead. Linktarget is a new table
deployed in production that contains immutable records of namespace id and
title string. The major differences between the page and linktarget tables
are: 1- linktarget values won’t change (unlike page records, which change with
a page move); 2- linktarget values can point to non-existent pages (= red links).
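As a rough illustration of the change, the sketch below builds both layouts in an in-memory SQLite database and runs the same lookup against each. The table names with `_old`/`_new` suffixes are made up for the demo, and the column names (`tl_from`, `tl_namespace`, `tl_title`, `tl_target_id`, `lt_id`, `lt_namespace`, `lt_title`) are assumptions based on MediaWiki's usual prefix convention, since this message does not spell them out. Check the real schema before relying on them.

```python
import sqlite3

NS_TEMPLATE = 10  # id of the Template namespace in MediaWiki

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Old layout: namespace and title repeated on every row (assumed columns).
    CREATE TABLE templatelinks_old (
        tl_from      INTEGER NOT NULL,
        tl_namespace INTEGER NOT NULL,
        tl_title     TEXT    NOT NULL
    );
    -- New layout: each (namespace, title) pair is stored once in linktarget.
    CREATE TABLE linktarget (
        lt_id        INTEGER PRIMARY KEY,
        lt_namespace INTEGER NOT NULL,
        lt_title     TEXT    NOT NULL,
        UNIQUE (lt_namespace, lt_title)
    );
    CREATE TABLE templatelinks_new (
        tl_from      INTEGER NOT NULL,
        tl_target_id INTEGER NOT NULL REFERENCES linktarget (lt_id)
    );
""")

# Pages 1, 2 and 3 all use Template:Foo.
pages = [1, 2, 3]
conn.executemany(
    "INSERT INTO templatelinks_old VALUES (?, ?, ?)",
    [(page_id, NS_TEMPLATE, "Foo") for page_id in pages],
)
conn.execute(
    "INSERT INTO linktarget (lt_namespace, lt_title) VALUES (?, ?)",
    (NS_TEMPLATE, "Foo"),
)
target_id = conn.execute(
    "SELECT lt_id FROM linktarget WHERE lt_namespace = ? AND lt_title = ?",
    (NS_TEMPLATE, "Foo"),
).fetchone()[0]
conn.executemany(
    "INSERT INTO templatelinks_new VALUES (?, ?)",
    [(page_id, target_id) for page_id in pages],
)

# "Which pages use Template:Foo?" against the old layout ...
old_rows = conn.execute(
    "SELECT tl_from FROM templatelinks_old "
    "WHERE tl_namespace = ? AND tl_title = ? ORDER BY tl_from",
    (NS_TEMPLATE, "Foo"),
).fetchall()

# ... and against the new layout, which needs a join through linktarget.
new_rows = conn.execute(
    "SELECT tl_from FROM templatelinks_new "
    "JOIN linktarget ON tl_target_id = lt_id "
    "WHERE lt_namespace = ? AND lt_title = ? ORDER BY tl_from",
    (NS_TEMPLATE, "Foo"),
).fetchall()

title_copies = conn.execute("SELECT COUNT(*) FROM linktarget").fetchone()[0]
```

Both queries return the same pages, but the title string is stored once in the new layout instead of once per linking page.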
The first table to be migrated is templatelinks; pagelinks, imagelinks and
categorylinks will follow. During the migration phase both values will be
accessible, but we will turn off writing to the old columns once the values
are backfilled and reads are switched to the new schema. We will
announce any major changes beforehand; this message is to let you know these
changes are coming.
While the normalization of all links tables will take several years to
finish, templatelinks will finish in the next few months and is the most
pressing one.
So if you:
- … rely on the schema of these tables in cloud replicas, you will need to
  change your tools.
- … rely on dumps of these tables, you will need to change your scripts.
Currently, templatelinks writes to both schemas for new rows in most
wikis. This week we will start backfilling the data into the new schema, but
it will take months to finish on large wikis.
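While the backfill is running, a tool may see a mix of rows that do and do not have a linktarget id yet. One possible way to stay compatible is to prefer the new column and fall back to the old ones. The sketch below assumes that the old columns are still present and that rows not yet backfilled hold NULL in `tl_target_id`. Both points, along with all column names and the helper `targets_of`, are assumptions made for illustration and are not confirmed by this message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE linktarget (
        lt_id        INTEGER PRIMARY KEY,
        lt_namespace INTEGER NOT NULL,
        lt_title     TEXT    NOT NULL
    );
    CREATE TABLE templatelinks (
        tl_from      INTEGER NOT NULL,
        tl_namespace INTEGER NOT NULL,
        tl_title     TEXT    NOT NULL,
        tl_target_id INTEGER          -- NULL until backfilled (assumption)
    );
    INSERT INTO linktarget VALUES (1, 10, 'Foo');
    INSERT INTO templatelinks VALUES (1, 10, 'Foo', NULL);  -- not yet backfilled
    INSERT INTO templatelinks VALUES (2, 10, 'Foo', 1);     -- already backfilled
""")


def targets_of(db, page_id):
    """Return (namespace, title) pairs linked from page_id (hypothetical helper).

    Uses the linktarget row when tl_target_id is set, else the old columns.
    """
    cursor = db.execute(
        """
        SELECT COALESCE(lt_namespace, tl_namespace),
               COALESCE(lt_title, tl_title)
        FROM templatelinks
        LEFT JOIN linktarget ON tl_target_id = lt_id
        WHERE tl_from = ?
        """,
        (page_id,),
    )
    return cursor.fetchall()


for page_id in (1, 2):
    print(page_id, targets_of(conn, page_id))
```

Once the old columns stop being written, the fallback half of the query no longer applies and a plain join on linktarget is enough.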
You can keep track of the general long-term work in
https://phabricator.wikimedia.org/T300222 and the specific work for
templatelinks in https://phabricator.wikimedia.org/T299417. You can also
read more on the reasoning in https://phabricator.wikimedia.org/T222224.
Thanks
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
How are we doing in our striving for operational excellence? Read on to find out!
Incidents
There were 6 incidents in June this year. That's double the median of three per month, over the past two years (Incident graphs <https://codepen.io/Krinkle/full/wbYMZK>).
2022-06-01 cloudelastic <https://wikitech.wikimedia.org/wiki/Incidents/2022-06-01_Lost_index_in_clou…>
Impact: For 41 days, Cloudelastic was missing search results about files from commons.wikimedia.org.
2022-06-10 overload varnish haproxy <https://wikitech.wikimedia.org/wiki/Incidents/2022-06-10_overload_varnish_h…>
Impact: For 3 minutes, wiki traffic was disrupted in multiple regions for cached and logged-in responses.
2022-06-12 appserver latency <https://wikitech.wikimedia.org/wiki/Incidents/2022-06-12_appserver_latency>
Impact: For 30 minutes, wiki backends were intermittently slow or unresponsive, affecting a portion of logged-in requests and uncached page views.
2022-06-16 MariaDB password <https://wikitech.wikimedia.org/wiki/Incidents/2022-06-16_MariaDB_password_l…>
Impact: For 2 hours, a current production database password was publicly known. Other measures ensured that no data could be compromised (e.g. firewalls and selective IP grants).
2022-06-21 asw-a2-codfw power <https://wikitech.wikimedia.org/wiki/Incidents/2022-06-21_asw-a2-codfw_accid…>
Impact: For 11 minutes, one of the Codfw server racks lost network connectivity. Among the affected servers was an LVS host. Another LVS host in Codfw automatically took over its load balancing responsibility for wiki traffic. During the transition, there was a brief increase in latency for regions served by Codfw (Mexico, and parts of US/Canada).
2022-06-30 asw-a4-codfw power <https://wikitech.wikimedia.org/wiki/Incidents/2022-06-30_asw-a4-codfw_accid…>
Impact: For 18 minutes, servers in the A4-codfw rack lost network connectivity. Little to no external impact.
Incident follow-up
Recently completed incident follow-up:
Audit database usage of GlobalBlocking extension <https://phabricator.wikimedia.org/T307648>
Filed by Amir (Ladsgroup) in May following an outage due to db load from GlobalBlocking. Amir reduced the extension's DB load by 10% by skipping checks for edit traffic from WMCS and Toolforge. He also implemented stats for monitoring GlobalBlocking DB queries going forward.
Reduce Lilypond shellouts from VisualEditor <https://phabricator.wikimedia.org/T312319>
Filed by Reuven (RLazarus) and Kunal (Legoktm) after a shellbox incident. Ed (Esanders) and Sammy (TheresNoTime) improved the Score extension's VisualEditor plugin to increase its debounce duration.
Remember to review and schedule Incident Follow-up work <https://phabricator.wikimedia.org/project/view/4758/> in Phabricator! These are preventive measures and tech debt mitigations written down after an incident is concluded. Read more about past incidents at Incident status <https://wikitech.wikimedia.org/wiki/Incident_status> on Wikitech.
Trends
In June and July (which is almost over), we reported 27 new production errors <https://phabricator.wikimedia.org/maniphest/query/WDqlrITVmIoX/#R> and 25 production errors <https://phabricator.wikimedia.org/maniphest/query/pzOAOpbnF3PX/#R> respectively. Of these 52 new issues, 27 were closed in the weeks since then, and 25 remain unresolved and will carry over to August.
We also addressed 25 stagnant problems that we carried over from previous months, thus the workboard overall remains at exactly 299 unresolved production errors.
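The bookkeeping behind these figures can be checked in a few lines. The variable names are illustrative, and the only inputs are the numbers quoted above.

```python
new_june, new_july = 27, 25
new_total = new_june + new_july               # new issues reported in both months
closed_new = 27                               # new issues already closed
carried_new = new_total - closed_new          # new issues carried over to August
resolved_old = 25                             # stagnant issues from earlier months

net_change = carried_new - resolved_old       # effect on the workboard size
workboard_now = 299
workboard_before = workboard_now - net_change

print(new_total, carried_new, net_change, workboard_before)
```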
Take a look at the Wikimedia-production-error <https://phabricator.wikimedia.org/tag/wikimedia-production-error/> workboard and look for tasks that could use your help.
💡 *Did you know?* To zoom in and find your team's error reports, use the appropriate "Filter" link in the sidebar of the workboard.
For the month-over-month numbers, refer to the spreadsheet data <https://docs.google.com/spreadsheets/d/e/2PACX-1vTrUCAI10hIroYDU-i5_8s7pony…>.
Thanks!
Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production.
Until next time,
– Timo Tijhof
🔗 Share or read later via https://phabricator.wikimedia.org/phame/post/view/292/
Hi everyone,
We’re happy to share the July 2022 edition of the Technical Community
Newsletter:
https://www.mediawiki.org/wiki/Technical_Community_Newsletter/2022/July
The newsletter is compiled by the Wikimedia Developer Advocacy Team. It
aims to share highlights, news, and information of interest from and about
the Wikimedia technical community.
The Wikimedia Technical Community is large and diverse, and we know we
can't capture everything perfectly. We would love to hear your ideas for
future newsletters. Got something you would like to see or something you
want to highlight in the next quarterly newsletter? Add your suggestion to
the talk page:
https://www.mediawiki.org/wiki/Talk:Technical_Community_Newsletter
If you'd like to keep up with updates and information, subscribe to the
Technical Community Newsletter:
https://www.mediawiki.org/wiki/Newsletter:Technical_Community_Newsletter
Thanks,
Melinda
--
Melinda Seckington
Developer Advocacy Manager
Wikimedia Foundation <https://wikimediafoundation.org/>
The Search Platform Team
<https://www.mediawiki.org/wiki/Wikimedia_Search_Platform> usually holds an
open meeting on the first Wednesday of each month. Come talk to us about
anything related to Wikimedia search, Wikidata Query Service (WDQS),
Wikimedia Commons Query Service (WCQS), etc.!
Feel free to add your items to the Etherpad Agenda for the next meeting.
Details for our next meeting:
Date: Wednesday, August 3rd, 2022
Time: 15:00-16:00 UTC / 08:00-09:00 PDT / 11:00-12:00 EDT / 16:00-17:00 WAT
/ 17:00-18:00 CEST
Etherpad: https://etherpad.wikimedia.org/p/Search_Platform_Office_Hours
Google Meet link: https://meet.google.com/vgj-bbeb-uyi
Join by phone: https://tel.meet/vgj-bbeb-uyi?pin=8118110806927
Hope to talk to you next week!
—Trey
Trey Jones
Staff Computational Linguist, Search Platform
Wikimedia Foundation
UTC–4 / EDT
Hey all,
Out of an abundance of caution, please hold all non-emergency deploys
until we have a fix for this one:
* scap no longer restarts php-fpm on canary servers
- https://phabricator.wikimedia.org/T313770
For urgent situations, be aware that:
1. Canary servers don't currently offer any assurances, since they
aren't currently being restarted after code is synced.
2. You'll need to restart PHP on:
{mw1414,mw1447,mw1415,mw1417,mw1418,mw1449,mw1450,mw1416,mw1448}.eqiad.wmnet
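The brace notation above is shell-style shorthand for nine hostnames. Here is a minimal sketch for expanding it into a plain list, for example to loop over the hosts. The helper name `expand_hosts` is hypothetical, and it handles only this single-brace form. No restart command is shown because this message does not specify one.

```python
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand '{a,b,c}.suffix' into ['a.suffix', 'b.suffix', 'c.suffix']."""
    match = re.fullmatch(r"\{([^{}]+)\}(.*)", pattern)
    if match is None:
        # No brace group: treat the input as a single hostname.
        return [pattern]
    names, suffix = match.groups()
    return [name.strip() + suffix for name in names.split(",")]


hosts = expand_hosts(
    "{mw1414,mw1447,mw1415,mw1417,mw1418,mw1449,mw1450,mw1416,mw1448}.eqiad.wmnet"
)
for host in hosts:
    print(host)
```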
Thanks, and sorry for the disruption. We expect to return you to your
regularly scheduled deployments shortly!
(CC'd to listed deployers for upcoming windows.)
--
Brennen Bearnes
Release Engineering
Wikimedia Foundation
Dear all,
in a regular MediaWiki installation switched to de-formal, all help links point to the English versions. This is probably because there is no de-formal version of the MediaWiki documentation, which of course is fine. But all links to Special:MyLanguage should lead to the German documentation for settings of "de", "de-formal", "de-at" etc.
As an example: the help icon on Special:RecentChanges leads to https://meta.wikimedia.org/wiki/Special:MyLanguage/Help:Recent_changes
But if you set your wiki to de-formal, it will lead you to the English help pages instead of the German ones.
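The behavior I would expect can be sketched as a lookup along a fallback chain. The `FALLBACKS` mapping and the `resolve_help_page` function below are hypothetical and do not reflect how Special:MyLanguage is actually implemented. They only illustrate that de-formal would try the German page before falling back to English.

```python
# Assumed fallback chains, for illustration only.
FALLBACKS = {
    "de-formal": ["de"],
    "de-at": ["de"],
    "de": [],
}


def resolve_help_page(title, user_lang, existing_translations):
    """Pick the best translation subpage for user_lang (hypothetical helper).

    Tries the user's language, then its fallbacks, then the untranslated page.
    """
    for lang in [user_lang, *FALLBACKS.get(user_lang, [])]:
        if lang in existing_translations:
            return f"{title}/{lang}"
    return title  # base (English) page


print(resolve_help_page("Help:Recent changes", "de-formal", {"de", "fr"}))
```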
Best,
Bernhard
Hi all,
I originally sent this email out on 20 May, 2022, but it seems like it
didn’t go out to everybody, unfortunately.
We are beginning the process of undeploying API Feature Usage
<https://phabricator.wikimedia.org/T313248> this week, and apologize for
the confusion, and what is now short notice for those of you who were not
aware of this announcement previously.
Best,
—
*Mike Pham* (he/him)
Sr Product Manager, Search
Wikimedia Foundation <https://wikimediafoundation.org/>
On 20 May, 2022 at 10:18:44, Mike Pham (mpham(a)wikimedia.org) wrote:
Hi all,
The Wikimedia Foundation Search team is currently working on updating our
Elasticsearch version to 7.10 <https://phabricator.wikimedia.org/T263142>
from version 6.8.20. Our goal is to finish this work in the next couple of
months. Being on the latest (license-compatible) version will help ensure
the stability and reliability of our search infrastructure.
This update will include 2 breaking changes:
1. Cloudelastic will be affected by breaking changes, with more details
   below. The interface might also change slightly.
   1. https://www.elastic.co/guide/en/elasticsearch/reference/7.17/breaking-chang…
   2. https://www.elastic.co/guide/en/elasticsearch/reference/7.10/migrating-7.10…
2. The API Feature Usage extension
   <https://www.mediawiki.org/wiki/Extension:ApiFeatureUsage>, whose
   functionality is unrelated to search, will no longer be supported. This
   low-usage extension is currently implemented in a complicated and brittle
   way that depends on Elasticsearch, which creates development drag for the
   Search team, who must continuously maintain it.
While in the short term API Feature Usage will be sunsetted, we recognize
that it’s probably useful for some users, and we encourage others to
continue to support and develop this extension in the longer term, without
its brittle dependency on Elasticsearch.
Though the Elasticsearch upgrade will provide overall net benefits, we
recognize that these breaking changes will unfortunately affect some users,
and appreciate your understanding as we improve our search infrastructure.
Best,
Search Platform
—
*Mike Pham* (he/him)
Sr Product Manager, Search
Wikimedia Foundation <https://wikimediafoundation.org/>