Having XML-based content would also enable a wide variety of new re-uses
of Wikimedia content. People could build all sorts of custom apps,
games, feeds, etc., without having to worry about broken syntax or
resorting to screen scraping (like we do for our mobile site). It would
also make implementing semantic features easier and thus could improve
our search capabilities. Plus it makes a great Bloody Mary!
On 1/5/11 8:26 AM, Daniel Friesen wrote:
On 11-01-05 02:09 AM, Daniel Kinzler wrote:
On 05.01.2011 05:25, Jay Ashworth wrote:
I believe the snap reaction here is "you haven't tried to diff XML,
have you?"
A text-based diff of XML sucks, but how about a DOM-based (structural)
diff?
I don't think a discussion on diff comparison of XML has much point.
I believe the idea floating around here (or at least the idea I'm
thinking of based on these discussions) is that we would store page text
in an xml format, a serialized php format, or something else where
contents are semantically noted with things like '<link
internal="true" title="FooBar">FooBar</link>'. To actually edit this
page content, we provide the data in multiple formats:
- Fully parsed output for page viewing
- A semantically marked up version of the html that is compatible with
the use of a WYSIWYG editor and can be converted back to the xml format
and then saved
- A WikiText-like format, similar to the WikiText we already have, that
users can edit in plaintext. We take the xml and convert it into that
format, and then when the user saves, parse that back into the xml format.
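
To make the third format concrete, here is a rough Python sketch of the round trip between a stored xml form and a WikiText-like plaintext form. Everything in it is an assumption made for illustration: the <page>/<p>/<link> element names, the [[Title|label]] syntax, and the function names xml_to_wikitext and wikitext_to_xml are hypothetical, not an existing MediaWiki format or API.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical link syntax: [[Title]] or [[Title|label]].
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def xml_to_wikitext(xml_source):
    """Render the assumed <page><p>...</p></page> storage format as plaintext."""
    paragraphs = []
    for p in ET.fromstring(xml_source).findall("p"):
        parts = [p.text or ""]
        for child in p:
            if child.tag == "link":
                title = child.get("title", "")
                label = child.text or title
                parts.append("[[%s]]" % title if label == title
                             else "[[%s|%s]]" % (title, label))
            parts.append(child.tail or "")
        paragraphs.append("".join(parts))
    return "\n\n".join(paragraphs)

def wikitext_to_xml(wikitext):
    """Parse the plaintext form back into the assumed xml storage format."""
    root = ET.Element("page")
    for block in wikitext.split("\n\n"):
        p = ET.SubElement(root, "p")
        last = None  # most recent <link>; text after it goes into its tail
        pos = 0
        for m in LINK_RE.finditer(block):
            text = block[pos:m.start()]
            if last is None:
                p.text = (p.text or "") + text
            else:
                last.tail = (last.tail or "") + text
            last = ET.SubElement(p, "link", internal="true", title=m.group(1))
            last.text = m.group(2) or m.group(1)
            pos = m.end()
        rest = block[pos:]
        if last is None:
            p.text = (p.text or "") + rest
        else:
            last.tail = (last.tail or "") + rest
    return ET.tostring(root, encoding="unicode")
```

With this sketch, a stored paragraph such as 'See <link internal="true" title="FooBar">FooBar</link>.' would be shown to the editor as 'See [[FooBar]].', and saving that text would produce the link element again.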
Naturally, if we're doing things like this, then rather than diffing the
ugly xml, the natural thing would most likely be to take the xml format
of both pages, convert it into that WikiText-like plaintext format and
show the user a diff of that so they know what meaningful changes were
made to the page.
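
A minimal sketch of that diff step, in Python: both revisions are flattened from xml into plaintext lines and only then compared, using the standard difflib module. The <page>/<p>/<link> layout, the [[Title]] rendering, and the names to_plaintext and revision_diff are assumptions for the example, not part of any existing code.

```python
import difflib
import xml.etree.ElementTree as ET

def to_plaintext(xml_source):
    """Flatten the assumed xml storage format into one plaintext line per <p>."""
    lines = []
    for p in ET.fromstring(xml_source).iter("p"):
        parts = [p.text or ""]
        for child in p:
            if child.tag == "link":
                # Hypothetical WikiText-like rendering of a link element.
                parts.append("[[" + child.get("title", "") + "]]")
            else:
                parts.append("".join(child.itertext()))
            parts.append(child.tail or "")
        lines.append("".join(parts))
    return lines

def revision_diff(old_xml, new_xml):
    """Diff two revisions on their plaintext rendering, not on the raw xml."""
    return list(difflib.unified_diff(
        to_plaintext(old_xml), to_plaintext(new_xml),
        fromfile="old", tofile="new", lineterm=""))

if __name__ == "__main__":
    old = ('<page><p>Intro text.</p><p>See <link internal="true" '
           'title="FooBar">FooBar</link>.</p></page>')
    new = ('<page><p>Intro text.</p><p>See <link internal="true" '
           'title="Baz">Baz</link>.</p></page>')
    print("\n".join(revision_diff(old, new)))
```

The user never sees the angle brackets; a changed link target shows up as an ordinary one-line change in the plaintext.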
If you really wanted to, you could also show them a diff of the end html
as an option, but that's fairly pointless.
As an extra bonus, besides enabling WYSIWYG, having that xml format also
has a good chance of making it much easier to give users an in-page diff
that marks up what was actually changed in the contents itself.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]