Page history structure isn't quite immutable: revisions may be added or deleted, pages may be renamed, and so on.
Shelling out to an external process means that when the process dies (due to a dead database connection, for example), we can restart it cleanly.
Brion, thanks for clarifying that.
Also, I want to ask you and the other developers about the idea of packing the export XML file, along with all exported uploads, into a ZIP archive (instead of embedding the uploads in the XML as base64). What do you think about it? We use this in our MediaWiki installations ("mediawiki4intranet") and find it quite convenient. ZIP was actually Tim Starling's idea; before ZIP we used rather strange "multipart/related" archives (I don't know why we did that :)).
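For illustration, here is a minimal sketch of such a container, written in Python for brevity. The member names ("export.xml", "uploads/...") and the helper itself are assumptions for the example, not the actual mediawiki4intranet layout:

```python
import zipfile

def pack_export(zip_path, xml_bytes, uploads):
    """Pack the export XML and raw upload files into one ZIP.

    `uploads` maps archive member names (e.g. "uploads/Example.png")
    to raw file bytes. Storing the binaries as-is avoids the roughly
    33% size overhead of base64-encoding them inside the XML.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("export.xml", xml_bytes)
        for name, data in uploads.items():
            zf.writestr(name, data)

# Hypothetical usage with placeholder content:
pack_export("export.zip",
            b"<mediawiki>...</mediawiki>",
            {"uploads/Example.png": b"\x89PNG..."})
```

An importer can then read the XML and look up each referenced upload by name in the same archive, instead of decoding base64 blobs.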
I'd like to finally get this change reviewed... What do you think about it?
Other improvements include advanced page selection (based on namespaces, categories, dates, imagelinks, templatelinks and pagelinks) and an improved import report (including a form of conflict detection). Should I split these into separate patches in Gerrit for ease of review?
Also, do all the archiving methods (7z) really need to be built into Export.php as dump filters, especially when using ZIP? With simple XML dumps you could just pipe the output to the compressor.
Or are they really needed to save temporary disk space during export? I ask because my version of import/export does not build the archive on the fly: it writes all the contents to a temporary directory and then archives the whole thing. Is that an acceptable approach?
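To make the trade-off concrete, here is a sketch of the "temporary directory first" approach, again in Python with hypothetical names. Unlike a streaming pipe, peak disk usage is roughly the uncompressed export plus the archive, since everything exists on disk before compression starts:

```python
import os
import tempfile
import zipfile

def archive_directory(src_dir, zip_path):
    """Archive a fully materialized export directory in one pass.

    Every file under src_dir is added to the ZIP with a path
    relative to src_dir, so the archive mirrors the directory tree.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for fname in files:
                full = os.path.join(root, fname)
                zf.write(full, os.path.relpath(full, src_dir))

# Build a tiny fake export tree, then archive it as a whole.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "export.xml"), "w") as f:
    f.write("<mediawiki/>")
archive_directory(tmp, "export_tree.zip")
```

The streaming alternative (piping dump output straight into a compressor) uses almost no extra disk space, but it cannot easily go back and add upload files next to the XML, which is presumably why a temporary directory is convenient here.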