Oh thanks for the great idea. I started
"abstract-schema" hashtag and
added them to as many as possible:
Best
On Mon, May 18, 2020 at 1:47 PM Gergo Tisza <gtisza(a)wikimedia.org> wrote:
Thank you so much for working on this, it was one
of the most painful
aspects of core development!
It might be worth using a consistent gerrit topic or hashtag to make
finding the relevant patches easy.
On Sat, May 9, 2020 at 3:21 AM Amir Sarabadani <ladsgroup(a)gmail.com>
wrote:
Hello,
In case you haven't done any changes on database schema of mediawiki
core,
let me explain the process to you (if you know
this, feel free to skip
this
paragraph):
* Mediawiki core supports three types of RDBMS: MySQL, Sqlite,
Postgres. It
used to be five (plus Oracle and MSSQL)
* For each one of these types, you need to do three parts: 1- Change the
tables.sql file so new installations get the new schema 2- Make .sql
schema
change file, like an "ALTER TABLE" for
current installations so they can
upgrade. 3- Wire that schema change file into *Updater.php file.
* For example, this is a patch to drop a column:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/473601 This file
touches
14 different files, adds 94 lines and removes
30.
This is bad for several reasons:
* It is extremely complicated to do a even a simple schema change.
Usually
something as simple as adding an column takes a
whole day for me. There
are
lots of complicating factors, like Sqlite
doesn't have ALTER TABLE, so
when
you want to make a patch for adding a column, you
need to make a
temporary
table with the new column, copy the old table
data to it, drop the old
table and then rename the old table.
** Imagine the pain and sorrow when you want to normalize a table
meaning
you need to do several schema changes: 1- Add a
table, 2- Add a column
on
the old table, 3- make the column not-nullable
when it's filled and make
the old column nullable instead 4- drop the old column.
* It's almost impossible to test all DBMS types, I don't have MSSQL or
Oracle installed and I don't even know their differences with MySQL. I
assume most other developers are good in one type, not all.
* Writing raw sqls, specially duplicated ones, and doubly specially
when we
don't have CI to test (because we won't
install propriety software in
our
infra) is pretty much prone to error. My
favourite one was that a new
column on a table was actually added to the wrong table in MSSQL and it
went unnoticed for two years (four releases, including one LTS).
* It's impossible to support more DBMS types through extensions or other
third party systems. Because the maintainer needs to keep up with all
patches we add to core and write their equivalents.
* For lots of reasons, these schemas are diverging, there have been
several
work to just reduce this to a minimum.
There was a RFC to introduce abstract schema and schema changes and it
got
accepted and I have been working to implement
this:
https://phabricator.wikimedia.org/T191231
This is not a small task, and like any big work, it's important to cut
it
to small pieces and gradually improve things. So
my plan is first, I
abstract the schema (tables.sql files), then slowly I abstract schema
changes. For now, the plan is to make these .sql files automatically
generated through maintenance scripts. So we will have a file called
tables.json and when running something like:
php maintenance/generateSchemaSql.php --json maintenance/tables.json
--sql
maintenance/tables-generated.sql --type=mysql
It would produce tables-generated.sql file. The code that produces it is
Doctrine DBAL and this is already installed as a dev dependency of core
because you would need Doctrine if you want to make a schema change, if
you
maintain an instance, you should not need
anything. Most of the work for
automatically generating schema is already merged and the last part that
wires it (and migrates two tables) is up for review:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240
My request is that I need to make lots of patches and since I'm doing
this
in my volunteer capacity, I need developers to
review (and potentially
help
with the work if you're excited about this
like me). Let me know if
you're
willing to be added in future patches and the
current patch also
welcomes
any feedback:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240
I have added the documentation in
https://www.mediawiki.org/wiki/Manual:Schema_changes for the plan and
future changes. The ideal goal is that when you want to do a schema
change,
you just change tables.json and create a json
file that is snapshot of
before and after table (remember, sqlite doesn't have alter table,
meaning
it has to know the whole table). Also, once we
are in a good shape in
migrating mediawiki core, we can start cleaning up extensions.
Any feedback is also welcome.
Best
--
Amir (he/him)
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l