Hey, As a pet project of mine, I have been trying to find and fix discrepancies between the database schema in the codebase (like tables.sql in mediawiki core) and production. I wrote a script to do a basic analysis and got the results but it was so big that I needed to cluster them so now we have this reports: https://phabricator.wikimedia.org/T104459#4314828
Some are hot fixes (like tmp_2 indexes) that probably need to be added to the codebase (with proper name) or dropped from production which needs investigation.
Some are unimplemented schema changes, like user_options column in user table that was dropped in the code nine years ago but fell into the cracks and never got fixed in production.
Some are changes that partially implemented like page_no_title_convert field on page table that only appear in 32 hosts (around two third of the hosts) and there is mention of adding it in HISTORY but I can't find any mention of removing it neither its existence so I have no idea how to proceed here.
The problem is that these cases need investigation before we can hand it off to our DBAs. For example, rc_cur_time field in recentchanges table has been dropped from the mediawiki core in 2014 but it exists in production but it also still exists in oracle, potgres and mssql schemas but not in mssql, sqlite or mysql schemas. In this case, we need to fix both production and several schemas in mediawiki core.
Or text table on 21 hosts (around half of them) has an extra field called inverse_timestamp which I can't find any mention of it in the code but there used to be a field with this name in revision table that got dropped in 2005 and I have no idea how to proceed here.
It would be great if you take a look the clustered reports [1] and make a phabricator ticket for them and investigate how to proceed here. Thank you
[1]: https://phabricator.wikimedia.org/T104459#4310482
Best