Dear Ariel,
I have been reading your code for `mwxml2sql-0.0.2' with a view towards updating it for mediawiki-1.23 LTS.
0) Support status
Currently, the version info for `mwxml2sql' states the following:
(shell)$ mwxml2sql --version
mwxml2sql 0.0.2
Supported input schema versions: 0.4 through 0.8.
Supported output MediaWiki versions: 1.5 through 1.21.
1) Current input schema version
Currently, your XML dump files have the following header:
(shell)$ head -n 1 zuwiki-20140121-pages-articles.xml
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.8/ http://www.mediawiki.org/xml/export-0.8.xsd" version="0.8" xml:lang="zu">
From this I gather that the XML schema version is still 0.8, and that
`mwxml2sql' needs no update on that head.
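For reference, the check I did by eye on the dump header could be sketched in C roughly as follows. This is only an illustration under assumptions: the function names `schema_version_from_header' and `schema_version_supported' are hypothetical and are not taken from the `mwxml2sql' source, which may well parse the header differently.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from mwxml2sql: reads the value of the
 * version="..." attribute from the <mediawiki ...> root element line.
 * Returns 0 on success, -1 if the attribute is missing or malformed. */
int schema_version_from_header(const char *line, int *major, int *minor)
{
    /* The leading space keeps us from matching the tail of another
     * attribute name that happens to end in "version". */
    const char *attr = strstr(line, " version=\"");

    if (attr == NULL)
        return -1;
    if (sscanf(attr, " version=\"%d.%d", major, minor) != 2)
        return -1;
    return 0;
}

/* Input schema range as reported by `mwxml2sql --version': 0.4 through 0.8. */
int schema_version_supported(int major, int minor)
{
    return major == 0 && minor >= 4 && minor <= 8;
}
```

Given the header line quoted above, the first function would yield major 0 and minor 8, which the second function accepts.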
2) Current output MediaWiki version
I reviewed the database schema for the `page', `revision', and `text' tables:
https://www.mediawiki.org/wiki/Manual:Page_table
https://www.mediawiki.org/wiki/Manual:Revision_table
https://www.mediawiki.org/wiki/Manual:Text_table
It appears that the most recent changes to the schema for these three tables occurred for mediawiki versions 1.21, 1.21, and 1.19, respectively.
From this I gather that the schema of these three tables in mediawiki
1.23 LTS is the same as in mediawiki 1.21, and therefore `mwxml2sql'
needs no update on that head.
3) Recommended updates
From a review of your code, I concluded that two minor changes would be useful.
3.1) mwxml2sql.c
The following three lines:
(shell)$ grep 21 mwxml2sql.c
fprintf(stderr,"Supported output MediaWiki versions: 1.5 through 1.21.\n\n");
/* we know MW 1.5 through MW 1.21 even though there is no MW 1.21 yet */
if (mwv->major != 1 || mwv->minor < 5 || mwv->minor > 21) {
should read
fprintf(stderr,"Supported output MediaWiki versions: 1.5 through 1.23.\n\n");
/* we know MW 1.5 through MW 1.23 */
if (mwv->major != 1 || mwv->minor < 5 || mwv->minor > 23) {
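As a possible follow-up (not part of the change proposed here), the upper bound could be kept in one place so that the usage message and the range check cannot drift apart. The following is a minimal sketch under assumptions: the macro names, the `mw_version_t' struct, and both function names are hypothetical. Only the `major' and `minor' fields are taken from the lines quoted above.

```c
#include <stdio.h>

/* Hypothetical constants: illustrative names, not definitions from mwxml2sql. */
#define MW_MAJOR      1
#define MW_MINOR_MIN  5
#define MW_MINOR_MAX  23

/* Hypothetical stand-in for whatever type `mwv' points to in mwxml2sql.c. */
typedef struct {
    int major;
    int minor;
} mw_version_t;

/* Returns nonzero when the requested output version lies in the supported range. */
int mw_version_supported(const mw_version_t *mwv)
{
    return mwv->major == MW_MAJOR &&
           mwv->minor >= MW_MINOR_MIN &&
           mwv->minor <= MW_MINOR_MAX;
}

/* Prints the supported range using the same constants as the check above. */
void print_supported_versions(FILE *out)
{
    fprintf(out, "Supported output MediaWiki versions: %d.%d through %d.%d.\n\n",
            MW_MAJOR, MW_MINOR_MIN, MW_MAJOR, MW_MINOR_MAX);
}
```

With this arrangement, a future bump would touch only `MW_MINOR_MAX'.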
3.2) mwxmlelts.c
The following line:
(shell)$ grep 21 mwxmlelts.c
<generator>MediaWiki 1.21wmf6</generator>
should read
<generator>MediaWiki 1.23wmf10</generator>
4) Request
Please let me know if you agree with the above assessment. If you do, I would be happy to submit the changes to https://gerrit.wikimedia.org/ for review.
Sincerely yours,
Kent
xmldatadumps-l@lists.wikimedia.org