Thank you Google for hiding the start of this thread in my spam folder
>_<
I'm going to have to change my import tools for the new format, but that's the way it goes; it's a reasonable change. Have you checked with folks on the xml data dumps list to see who might be affected?
Ariel
On Thu, 23-10-2014, at 09:52 -0500, Aaron Halfaker wrote:
I spend a lot of time processing the XML dumps that this will affect. I just wanted to chime in to say that this change makes sense to me and it won't affect my work.
-Aaron
On Thu, Oct 23, 2014 at 9:06 AM, Daniel Kinzler daniel@brightbyte.de wrote:
tl;dr:
In the xml dumps, I want to change <text> <sha1> <model> <format> to <model> <format> <text> <sha1>
However, this is a breaking change to our XML schema. See https://bugzilla.wikimedia.org/show_bug.cgi?id=72417
Background:
While trying to fix bug 72361, I ran into an issue with our current XML dump format:
The <model> and <format> tags are placed *after* the <text> tag. This means that we don't know how to handle the text when we process XML events in a stream - we'd have to buffer the text, wait until we know model and format, and then process it. A pain.
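To illustrate the buffering problem, here is a minimal Python sketch using the standard library's SAX parser. It is not MediaWiki code: the `RevisionHandler` class, the `classify` helper, and the two sample `<revision>` fragments (including the `abc` sha1 value) are made up for illustration, and real dump revisions carry more elements than shown here.

```python
import xml.sax


class RevisionHandler(xml.sax.ContentHandler):
    """Notes, per revision, whether <text> had to be buffered."""

    def __init__(self):
        super().__init__()
        self.current = None   # name of the element currently being read
        self.meta = {}        # model/format values seen so far in this revision
        self.buffered = []    # text chunks held back because model/format were unknown
        self.log = []         # one entry per revision: "buffered" or "streamed"

    def startElement(self, name, attrs):
        self.current = name
        if name == "revision":
            self.meta, self.buffered = {}, []

    def characters(self, content):
        if self.current in ("model", "format"):
            self.meta[self.current] = self.meta.get(self.current, "") + content
        elif self.current == "text":
            if not {"model", "format"} <= self.meta.keys():
                # Old order: we cannot interpret the text yet, so hold on to it.
                self.buffered.append(content)
            # New order: the chunk could be handed to a model-specific
            # consumer right here, without keeping it in memory.

    def endElement(self, name):
        self.current = None
        if name == "revision":
            self.log.append("buffered" if self.buffered else "streamed")


def classify(xml_string):
    """Parse a fragment and report how each revision's text was handled."""
    handler = RevisionHandler()
    xml.sax.parseString(xml_string.encode("utf-8"), handler)
    return handler.log


# Hypothetical, heavily simplified fragments in the current and proposed order.
OLD_ORDER = ("<revision><text>Hello</text><sha1>abc</sha1>"
             "<model>wikitext</model><format>text/x-wiki</format></revision>")
NEW_ORDER = ("<revision><model>wikitext</model><format>text/x-wiki</format>"
             "<text>Hello</text><sha1>abc</sha1></revision>")

print(classify(OLD_ORDER))
print(classify(NEW_ORDER))
```

With the current order the handler has to keep the whole text in memory until the closing tags of the revision; with the proposed order it can pass each chunk along as it arrives.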
The current order has no deeper meaning - it is, indeed, my own fault: I didn't think this through when adding these tags. I propose to change the order of the tags now, to make stream processing easier.
That would technically be a breaking change to the dump format, incompatible with https://www.mediawiki.org/xml/export-0.8.xsd and export-0.9.xsd. I doubt, however, that any consumers rely on the current placement of <model> and <format>, as it is extremely inconvenient (compare bug 72361), but you never know.
I propose to release a new XSD version 0.10 with the order changed, and mention it in the release notes. Should be fine.
Any objections?
-- daniel
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l