Hoi,
You may want to wait until the dumps are fixed. Magnus fixed the second-to-last dump by hand. The dump after that is still broken. Wait until we KNOW the dumps are ok.
Thanks,
GerardM
On 27 October 2014 21:58, Ariel T. Glenn aglenn@wikimedia.org wrote:
Thank you Google for hiding the start of this thread in my spam folder >_<
I'm going to have to change my import tools for the new format, but that's the way it goes; it's a reasonable change. Have you checked with folks on the xml data dumps list to see who might be affected?
Ariel
On Thu, 23-10-2014, at 09:52 -0500, Aaron Halfaker wrote:
I spend a lot of time processing the XML dumps that this will affect. I just wanted to chime in to say that this change makes sense to me and it won't affect my work.
-Aaron
On Thu, Oct 23, 2014 at 9:06 AM, Daniel Kinzler daniel@brightbyte.de wrote:
tl;dr:
In the xml dumps, I want to change <text> <sha1> <model> <format> to <model> <format> <text> <sha1>
However, this is a breaking change to our XML schema. See https://bugzilla.wikimedia.org/show_bug.cgi?id=72417
Background:
While trying to fix bug 72361, I ran into an issue with our current XML dump format:
The <model> and <format> tags are placed *after* the <text> tag. This means that we don't know how to handle the text when we process XML events in a stream - we'd have to buffer the text, wait until we know model and format, and then process it. A pain.
The current order has no deeper meaning - it is, indeed, my own fault: I didn't think this through when adding these tags. I propose to change the order of the tags now, to make stream processing easier.
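
To illustrate the difference, here is a minimal sketch of a SAX-style consumer, not taken from any real import tool. The RevisionHandler class, the scan() helper and the two sample documents are hypothetical and heavily simplified compared to a real dump. The handler processes <text> chunks as they arrive when model and format are already known, and otherwise has to hold them back until the end of the revision:

```python
import xml.sax


class RevisionHandler(xml.sax.ContentHandler):
    """Hypothetical consumer: streams <text> if model/format are already known."""

    def __init__(self):
        super().__init__()
        self.current = None   # name of the element whose characters we are reading
        self.results = []     # (model, format, text_length, had_to_buffer)

    def startElement(self, name, attrs):
        self.current = name
        if name == "revision":
            self.model = None
            self.format = None
            self.known = False     # were model and format known when <text> began?
            self.streamed = 0      # characters processed immediately
            self.pending = []      # chunks held back until model/format arrive
        elif name == "text":
            self.known = self.model is not None and self.format is not None

    def characters(self, content):
        # The parser may deliver one element's text in several chunks.
        if self.current == "model":
            self.model = (self.model or "") + content
        elif self.current == "format":
            self.format = (self.format or "") + content
        elif self.current == "text":
            if self.known:
                self.streamed += len(content)   # handle the chunk right away
            else:
                self.pending.append(content)    # forced to buffer

    def endElement(self, name):
        self.current = None
        if name == "revision":
            buffered = bool(self.pending)
            length = self.streamed + len("".join(self.pending))
            self.results.append((self.model, self.format, length, buffered))


def scan(xml_text):
    """Parse a (simplified) dump fragment and report one tuple per revision."""
    handler = RevisionHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.results


# Simplified sample fragments; real dumps carry many more elements.
OLD_ORDER = ("<page><revision><text>Hello</text><sha1>abc</sha1>"
             "<model>wikitext</model><format>text/x-wiki</format></revision></page>")
NEW_ORDER = ("<page><revision><model>wikitext</model><format>text/x-wiki</format>"
             "<text>Hello</text><sha1>abc</sha1></revision></page>")
```

With the current order, scan(OLD_ORDER) reports that the text had to be buffered; with the proposed order, scan(NEW_ORDER) reports that it could be handled chunk by chunk.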
That would technically be a breaking change to the dump format, incompatible with https://www.mediawiki.org/xml/export-0.8.xsd and export-0.9.xsd. I doubt however that any consumers rely on the current placement of <model> and <format>, as it is extremely inconvenient (compare bug 72361), but you never know.
I propose to release a new XSD version 0.10 with the order changed, and mention it in the release notes. Should be fine.
Any objections?
-- daniel
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l