Do the Wikimedia XML dump scripts even use PHP / MediaWiki at all? I am aware of some Python scripts.

Please check with Ariel.

Katie

On Apr 14, 2014 12:47 PM, "Daniel Kinzler" <daniel.kinzler@wikimedia.de> wrote:
Hi all!

Context: We plan to change the XML dumps (and Special:Export) to use the same
JSON serialization that is used by the API, instead of the terse but brittle
"internal" format. This is about the mechanism we plan to use for the conversion.

So, I just went and checked my assertion that WikiExporter will use the Content
object's serialize method to generate output. I WAS WRONG. It doesn't. It'll use
the text from the database, as-is (for reference, find the call to
Revision::getRevisionText in Export.php).

In order to force a conversion to the new format, we'll need to patch core to a)
inject a hook here to override the default behavior or b) make it always use a
Content object (unless, perhaps, told explicitly not to).
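
To make the difference between a) and b) concrete, here is a rough sketch in
Python rather than PHP. It is not MediaWiki code: Content, Revision,
export_text_hooks and both export functions are hypothetical stand-ins, and the
real change would have to be made where Export.php calls
Revision::getRevisionText.

```python
import json
from typing import Callable, List, Optional


class Content:
    """Hypothetical stand-in for a Content object that can serialize itself."""

    def __init__(self, data: dict):
        self.data = data

    def serialize(self) -> str:
        # Assumption: the "new format" is plain JSON.
        return json.dumps(self.data, sort_keys=True)


class Revision:
    """Hypothetical revision: the raw database text plus a Content object."""

    def __init__(self, raw_text: str, content: Content):
        self.raw_text = raw_text
        self.content = content


# Option a): a hook point. Each hook may return replacement text,
# or None to leave the default behavior alone.
export_text_hooks: List[Callable[[Revision], Optional[str]]] = []


def export_text_with_hook(rev: Revision) -> str:
    for hook in export_text_hooks:
        override = hook(rev)
        if override is not None:
            return override
    # Default behavior: the text from the database, as-is.
    return rev.raw_text


# Option b): always go through the Content object,
# unless explicitly told not to.
def export_text_via_content(rev: Revision, use_raw: bool = False) -> str:
    if use_raw:
        return rev.raw_text
    return rev.content.serialize()
```

With a), nothing changes for existing callers until some extension registers a
hook. With b), every export is converted by default and callers that need the
stored text have to opt out.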

This is not hard to code, but doing it Right (tm) may need some discussion, and
getting it merged may need some time.

Sorry for not checking this earlier.
Daniel

--
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
