As part of work to make storage of external links in MediaWiki continue to
scale without risking site stability (T312666
<https://phabricator.wikimedia.org/T312666>), we are deprecating most of
the special functionalities around proto-relative URLs (URLs that start
with // instead of https:// or http://).
Proto-relative URLs were beneficial a decade ago, when Wikimedia projects
were being served in both encrypted and unencrypted traffic (http and
https). However, since 2015, all of our traffic has been served encrypted
only, and this functionality doesn’t provide much user benefit any more for
Wikimedia wikis. With HTTP/2, a similar circumstance applies to all
As well as being low-value, our external links storage (the externallinks
table) has grown to be one of the biggest tables for each production wiki.
This is due to many duplications of URL information, added to serve
different use cases. With the changes, we are removing these duplications,
and some of the functionality. You can read more about the work in T312666
Storage of proto-relative URLs has changed to only store HTTPS URLs
Previously, if a proto-relative was added in an edit, MediaWiki internally
treated it as two links one with http:// and one with https://. From this
week forward, for all Wikimedia wikis, the storage will change to store
only https:// URLs. Once those wikis are switched to read the new database
schema, the links will be presented as https only in Special:LinkSearch and
their API counter-parts. This means effectively a proto-relative external
link will be treated like a HTTPS one. This change will also apply to
non-Wikimedia wikis using MediaWiki 1.41+.
expandurl option is deprecated and ignored in the exturlusage and extlinks
MediaWiki action API modules
This means “expandurl” argument in exturlusage and extlinks
<https://www.mediawiki.org/wiki/API:Extlinks> API modules will be ignored
and proto-relative URLs will be always expanded to HTTPS. This will happen
any time a wiki is switched to read from the new externallinks fields. (You
can track the progress in T335343
<https://phabricator.wikimedia.org/T335343>) This change will also apply to
non-Wikimedia wikis using MediaWiki 1.41+.
If your wiki heavily uses proto-relative URLs in articles' wikitext, we
recommend changing them to https instead which also improves storage as
every proto-relative URLs takes up two rows.
Amir Sarabadani, Staff Database Architect
James Forrester, Staff Software Engineer
Timo Tijhof, Principal Performance Engineer
The Code of Conduct Committee is a team of five trusted individuals (plus
five auxiliary members) with diverse affiliations responsible for general
enforcement of the Code of conduct for Wikimedia technical spaces.
Committee members are in charge of processing complaints, discussing with
the parties affected, agreeing on resolutions, and following up on their
enforcement. For more on their duties and roles, see
This is a call for community members interested in volunteering for
appointment to this committee. Volunteers serving in this role should be
experienced Wikimedians or have had experience serving in a similar
The current committee is doing the selection and will research and discuss
candidates. Six weeks before the beginning of the next Committee term,
meaning July 16, 2023, they will publish their candidate slate (a list of
candidates) on-wiki. The community can provide feedback on these
candidates, via private email to the group choosing the next Committee. The
feedback period will be two weeks. The current Committee will then either
finalize the slate, or update the candidate slate in response to concerns
raised. If the candidate slate changes, there will be another two week
feedback period covering the newly proposed members. After the selections
are finalized, there will be a training period, after which the new
Committee is appointed. The current Committee continues to serve until the
feedback, selection, and training process is complete.
If you are interested in serving on this committee or like to nominate a
candidate, please write an email to techconductcandidates AT wikimedia.org
with details of your experience on the projects, your thoughts on the code
of conduct and the committee and what you hope to bring to the role and
whether you have a preference in being auxiliary or main member of the
committee. The committee consists of five main members plus five auxiliary
members and they will serve for a year; all applications are appreciated
and will be carefully considered. The deadline for applications is *the end
of day on May 31 2023*.
Please feel free to pass this invitation along to any users who you think
may be qualified and interested.
Martin Urbanec, on behalf of the Code of Conduct Committee
We have two breaking changes to announce.
Breaking change to Rdbms Database subclasses
There are no Database subclasses currently known in Codesearch (besides
those built-in to core). But, since the class is documented as "stable to
extend" in API, an announcement is in order.
First, an explanation on what is happening. Currently, queries flow in the
Rdbms library as follows: When Database::select() or similar is called
(directly, or via the query builders), the Rdbms library internally builds
the SQL text and then calls the general-purpose function Database::query().
Subsequently, ::query() then runs several regexes over the SQL text to
figure out information about this query (which tables are affected, whether
it’s a write query, what the query “verb” is, and so on). This is taxing
and slow. The appservers at WMF, for example, spend 0.34% of CPU on web
requests just to extract information from already-formatted SQL queries for
logging purposes. This logic drastically reduces readability and
maintainability of Rdbms internals.
To fix this, we've internally introduced a new class called Query, to
represent the SQL text in a more structured way in the first place. This
avoids having to first format structure data into SQL text only to
reverse-engineer it right afterwards.
This change requires changing the protected Database::executeQuery() method
to no longer accept a string argument and instead require a Query object.
Additionally IDatabase::getTempTableWrites() has become a private method.
See the patch <https://gerrit.wikimedia.org/r/c/mediawiki/core/+/910756> or the
ticket <https://phabricator.wikimedia.org/T326181> for more information.
The public Database::query() method continues to support a string argument,
but now also supports a Query object. When given a string, it falls back to
the old regexes to extract and create a Query object. With the newly
introduced UnionQueryBuilder, we believe there are no longer cases where
MediaWiki extensions have to call Database::query, and can instead adopt
query builders for all their queries. Direct calls to Database::query
outside of MediaWiki core are now highly discouraged.
Breaking change to typehints of IDatabase
Many parts of core typehint to IDatabase, but most times these always
return a replica database. For example, ApiBase::getDB(). This means the
interface technically allows calling insert(), and this is indeed valid,
but would also immediately throw a fatal error once the method is entered.
Going forward, most of these will be typehinted to IReadableDatabase
instead, which is a narrower interface (34 public methods instead of 83).
This is not a breaking change in practice, because calls to those methods
would already produce an error, but the typehint makes it official now. This
<https://gerrit.wikimedia.org/r/c/mediawiki/core/+/911269> is the patch
changing ApiBase::getDB(), and we expect more such changes to other parts
of MediaWiki over the coming months. Those will not be individually
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
As part of ongoing refactors and improvements to the Rdbms component in
MediaWiki, we have made some changes that might affect your work. Read
for previous changes in this area.
New DeleteQueryBuilder and UnionQueryBuilder
We introduced the chainable SelectQueryBuilder
in 2020 to create SELECT queries. This "fluent" API style allows each
segment of the query to be set through a discoverable method name with rich
documentation and parameter types. Last February we introduced
UpdateQueryBuilder in the same vein.
The new DeleteQueryBuilder similarly supersedes use of IDatabase::delete()
for building DELETE operations. Check change 913646
for how this is used in practice.
Union queries are fairly rare in our codebases. Previously, one would have
to bypass the query builders (by calling the underlying SQL text formatters
like Database::selectSqlText) and pass several raw SQL strings through
Database::unionQueries and then to Database::query. As you can see, this is
not optimal. The new UnionQueryBuilder enables you to combine several
SelectQueryBuilder objects and then use the familiar fetchResultSet()
method on the resulting UnionQueryBuilder object. You can see an example of
this in change 906101
In February, we introduced the LBFactory->getPrimaryConnection() and
LBFactory->getReplicaConnection() methods to improve ergonomics around
getting database connections, and the IReadableDatabase typehint for
We're now introducing the IConnectionProvider typehint as a stable
interface holding the above methods without the rest of LBFactory. Meaning,
if you all you need is a connection, you can typehint to
IConnectionProvider. This is an extremely narrow interface (four public
methods!) compared to LBFactory (42 public methods). This will reduce
coupling and should ease testing as well. We already adopted
IConnectionProvider in several MediaWiki core components, which made a
notable dent in MediaWiki's PhpMetrics complexity score
This backwards-compatible change can be adopted through only a change in
typehint. MediaWiki’s LBFactory service implements IConnectionProvider.
Dependency injection remains unchanged.
You can see an example in change 913649
We recommend changing variable names like $lbFactory to $dbProvider for
consistency in usage.
We will continue transforming MediaWiki core to adopt query builders,
IConnectionProvider, and IReadableDatabase. And, we'll continue helping
various core components and extensions to phase out their direct calls to
LoadBalancer::getConnection(). This should improve code health,
readability, testability, and overall coupling scores in MediaWiki.
Our next announcement will introduce InsertQueryBuilder, and we're
considering a query builder for "UPSERT" queries as well. Lastly, we are
planning to discourage remaining use of raw SQL in "where" conditions by
introducing an expression builder (T210206
You can help
Nearly every feature of MediaWiki requires retrieving or storing
information from a database at one point or another. Reworking these is a
long tail of effort. Any help in adopting query builders, and updating code
away from ILoadBalancer and ILBFactory would be really appreciated.
Together we can improve readability and code health (now with less
Until the next update,
Amir Sarabadani (he/him), Staff Database Architect
Timo Tijhof, Principal Performance Engineer