(Adding engineering-all@ for Wikimedia engineers' awareness.)
Good work on this Amir, James, Gergo, Reedy, DBAs, and all the others not
Cleaning up tech debt to provide a stable base for future code hygiene is
important, hard, and sometimes thankless work.
On Thu, May 27, 2021 at 4:33 PM Amir Sarabadani <ladsgroup(a)gmail.com> wrote:
If you missed the previous updates, there's the first
This is the last one I'm sending but in a good way.
We now finished migrating all 57 core tables to abstract schema. Now tables.sql
is empty for MySQL
We will soon remove the tables.sql files and links to them.
We have also cleaned more than hundreds of old schema change files.
<https://phabricator.wikimedia.org/T272199> That enabled us to actually
look for unused sql files and drop tens of unused ones that have not been
used since 2002 <https://gerrit.wikimedia.org/r/c/mediawiki/core/+/668552>,
2004 <https://gerrit.wikimedia.org/r/c/mediawiki/core/+/668761>, or 2005
<https://gerrit.wikimedia.org/r/c/mediawiki/core/+/670176> (and much
more) and they got lost in the sheer number of our sql patch files.
The next update is exciting for me. With abstraction in place, we can now
have a proper tracking of drifts between schema in paper and our production
(This is a follow up from a major incident in 2018). Now we have
that keeps track of these drifts.
Our schema has been around for more than twenty years and we have hundreds
of database hosts, making sure everything is using the right database
schema (and stays correct) is impossible manually and we have been finding
and fixing these drifts since 2018, see this comment onwards
<https://phabricator.wikimedia.org/T104459#4314828>.). Of course more
work in improving the tracker is welcome (here's the list
If you want to enjoy the benefits of abstract schema  in extension(s)
you or your team maintains, Please abstract the schema of your extension.
There's a long list of WMF deployed extensions that are not using
abstract schema <https://phabricator.wikimedia.org/T261912> and some even
already have a patch that only needs reviewing. Once that's done, we can
add that to drift tracking and have a more comprehensive list of potential
issues. If you need help with the abstraction work, just ping me.
This also helped us resolve several long-standing tickets like T104459
<https://phabricator.wikimedia.org/T104459> (5.5 years old), T62962
<https://phabricator.wikimedia.org/T62962> (7 years old), and T42626
<https://phabricator.wikimedia.org/T42626> (soon reaching its ninth
birthday) and will help us to address even more tech debt in future.
There is more to be done, improving the abstract schema *change* system,
finding a home of schema documentation, improving the drift tracker and
making it more automated, so much more. But the biggest chunk of work is
now finally done.
I really would like to thank Ammarpad for great work on abstracting the
tables and handling all sorts of edge cases, James Forrester and Tgr for
their reviews which without them this wouldn't be possible and Sam Reed who
wrote a script to speed up migration
This was a team work to its core.
 In more details, by abstracting you will have automated checks for
dirfits of the schema of extension(s) you maintain and production. You will
have Postgres support for free. Also, you can have automated documentation
generation, ability to test the schema itself, and have better consistency
of your data types (like one datatype for timestamps).
Until the next adventure.
Wikitech-l mailing list -- wikitech-l(a)lists.wikimedia.org
To unsubscribe send an email to wikitech-l-leave(a)lists.wikimedia.org
| Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E |
| Dir. Engineering Productivity A18D 1138 8E47 FAC8 1C7D |