On Mon, Jul 1, 2013 at 10:15 PM, Daniel Friesen <daniel@nadir-seen-fire.com> wrote:
How are you dealing with extensibility?
We need to be able to extend the format. The fields of data we need to export change over time (just look at the changelog for our export's XSD file https://www.mediawiki.org/xml/export-0.7.xsd ).
I have touched on this in my answer to Ariel's email. I think that for now, there will be just a single data version number in the header of the dump file. But I will make sure to leave open the possibility of having a version number on each object.
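To make the versioning idea concrete, here is a rough Python sketch. It is not the actual dump format: the magic bytes, the header layout, and the function names are all made up for illustration. It only shows a single version number in the file header, with an optional per-object version that overrides it if one is ever added.

```python
import struct
from typing import Optional

# Hypothetical header layout: 4-byte signature + 16-bit data version.
# Neither the signature nor the field widths come from the real format.
MAGIC = b"DUMP"
HEADER = struct.Struct("<4sH")


def write_header(data_version: int) -> bytes:
    """Serialize the file header carrying the single data version number."""
    return HEADER.pack(MAGIC, data_version)


def read_header(buf: bytes) -> int:
    """Return the data version stored in the file header."""
    magic, data_version = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("not a dump file")
    return data_version


def effective_version(file_version: int,
                      object_version: Optional[int] = None) -> int:
    """Pick the version used to decode one object.

    Today every object would use the file-wide version; a per-object
    version, if introduced later, would take precedence.
    """
    return file_version if object_version is None else object_version
```

A reader would call `read_header` once and then pass the result to `effective_version` for each object, so adding per-object versions later would not change the calling code.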
Here are some things in that XML format you are missing in the incremental:
- Redirect info
- Upload info
- Log items
- Liquid Threads support
I should have gone to the source instead of assuming that looking at a few samples would be enough. I will add redirect and upload info to the format description.
As far as I know, log items are in a separate XML dump and I'm not planning to replace that one.
Unless I'm mistaken, Liquid Threads doesn't have much of a future and is used only on a few wikis like mediawiki.org. Does anyone actually use this information from the dumps?
And something that I don't think we've thought about support for in our current export format, ContentHandler. There's metadata for it missing from our dumps and the data format is somewhat different than our text dumps have traditionally expected.
The current dumps already store the model and format. Is there something else needed for ContentHandler? The dumps don't really care what the format or encoding of the revision text is; it's just a byte stream to them.
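As an illustration of what "just a byte stream" means here, a minimal Python sketch follows. The record layout (three length-prefixed fields) and the names `RevisionText`, `encode` and `decode` are assumptions made up for this example, not the real dump format. The model and format are stored as labels next to the text, and the text itself is never interpreted.

```python
import struct
from dataclasses import dataclass

_LEN = struct.Struct("<I")  # hypothetical 32-bit length prefix


@dataclass(frozen=True)
class RevisionText:
    model: str    # content model, e.g. "wikitext"
    format: str   # content format, e.g. "text/x-wiki"
    data: bytes   # revision text, stored as opaque bytes


def encode(rev: RevisionText) -> bytes:
    """Write model, format and text as three length-prefixed fields."""
    fields = (rev.model.encode("utf-8"), rev.format.encode("utf-8"), rev.data)
    return b"".join(_LEN.pack(len(f)) + f for f in fields)


def _read_field(buf: bytes, offset: int):
    """Read one length-prefixed field; return (bytes, next offset)."""
    (length,) = _LEN.unpack_from(buf, offset)
    start = offset + _LEN.size
    end = start + length
    if end > len(buf):
        raise ValueError("truncated record")
    return buf[start:end], end


def decode(buf: bytes) -> RevisionText:
    """Inverse of encode(); the text bytes are returned untouched."""
    model, pos = _read_field(buf, 0)
    fmt, pos = _read_field(buf, pos)
    data, _ = _read_field(buf, pos)
    return RevisionText(model.decode("utf-8"), fmt.decode("utf-8"), data)
```

Because `data` is never decoded, a revision whose text is not valid UTF-8 round-trips the same way as ordinary wikitext.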
Petr Onderka