Hi
A couple of things:
1. JSON is not a very reader-friendly format, nor is it an ideal format for
the search engine to consume, because it lacks support for metadata and a
data schema. XML is universally supported, more human-friendly, and supports
a schema, which can be useful well beyond this initial use.
2. Be bold, but also be smart and give respect where it is due. Bots, and
the tools everyone else has written for and about MediaWiki that make basic
assumptions about the page structure, would be broken. Many will not adapt
so readily.
3. A project like Wikidata, in its infancy, should make every effort to be
backwards compatible. It would be far wiser to place the Wikidata data into
a page with wiki source, using a custom <xml/> tag or even a <cdata/> XHTML
tag.
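
A rough sketch of what point 3 could look like from a bot's side, in Python.
The <xml> wrapper tag, the <item>/<label> payload and the extract_data helper
are all hypothetical; no such tag exists in MediaWiki today. The only claim is
that a tool which treats the page as wikitext keeps working, while a
data-aware tool can pull the structured part out.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical custom tag wrapping structured data inside ordinary wikitext.
DATA_BLOCK = re.compile(r"<xml>(.*?)</xml>", re.DOTALL)


def extract_data(wikitext: str) -> Optional[ET.Element]:
    """Return the parsed payload of the first <xml> block, or None if absent."""
    match = DATA_BLOCK.search(wikitext)
    if match is None:
        return None
    return ET.fromstring(match.group(1).strip())


# Example page: plain wikitext with one embedded data block.
page = """'''Berlin''' is a city.
<xml>
<item><label lang="en">Berlin</label></item>
</xml>
[[Category:Cities]]"""

item = extract_data(page)
```

Existing bots would still see a normal wikitext page and could ignore the
block entirely.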
Oren Bochman
-----Original Message-----
From: wikitext-l-bounces(a)lists.wikimedia.org
[mailto:wikitext-l-bounces@lists.wikimedia.org] On Behalf Of Daniel Kinzler
Sent: Tuesday, March 27, 2012 9:14 AM
To: Wikitext-l
Cc: wikitech-l(a)lists.wikimedia.org; daniel(a)nadir-seen-fire.com
Subject: Re: [Wikitext-l] Fwd: [Wikitech-l] Cutting MediaWiki loose from
wikitext
On 27.03.2012 02:19, Daniel Friesen wrote:
> Non-wikitext data is supposed to give extensions the ability to do
> things beyond WikiText. The data is always going to be an opaque form
> controlled by the extension.
> I don't think that low level serialized data should be visible at all
> to clients. Even if they know it's there.
The serialized form of the data needs to be visible at least in the XML dump
format. How else could we transfer non-wikitext content between wikis?
Using the serialized form may also make sense for editing via the web API,
though I'm not sure yet what the best way is here:
a) keep using the current general, text-based interface with the serialized
form of the content, or
b) require a specialized editing API for each content type.
Going with a) has the advantage that it will simply work with current API
client code. However, if the client modifies the content and writes it back
without being aware of the format, it may corrupt the data. So perhaps we
should return an error when a client tries to edit a non-wikitext page "the
old way".
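
A minimal sketch of such a guard, in Python rather than MediaWiki's PHP. The
Page class, the edit_text function, the client_model parameter and the model
identifiers are hypothetical names chosen for illustration, not an existing
API. The idea is only that a generic text edit is refused unless the client
states that it knows the page's content model.

```python
from dataclasses import dataclass
from typing import Optional

WIKITEXT_MODEL = "wikitext"  # hypothetical model identifier


class UnsupportedContentModelError(Exception):
    """Raised when a generic text edit targets non-wikitext content."""


@dataclass
class Page:
    title: str
    content_model: str
    text: str  # serialized form of the content


def edit_text(page: Page, new_text: str, client_model: Optional[str] = None) -> Page:
    """Apply an old-style text edit.

    Wikitext pages are always editable this way. Other pages are editable
    only if the client declares the matching content model, which signals
    that it is aware of the serialization it is writing back.
    """
    if page.content_model != WIKITEXT_MODEL and client_model != page.content_model:
        raise UnsupportedContentModelError(
            f"{page.title} holds {page.content_model} content"
        )
    return Page(page.title, page.content_model, new_text)
```

Unaware clients keep working on wikitext pages and get a clear error, instead
of silently corrupting data, everywhere else.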
The b) option is a bit annoying because it means that we have to define a
potentially quite complex mapping between the content model and the API's
result model (nested PHP arrays). This is easy enough for Wikidata, which
uses a JSON-based internal model. But for, say, SVG... well, I guess the
specialized mapping could still be "escaped XML as a string".
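
To illustrate the difference between the two cases, here is a small Python
sketch. The function names and the "wikidata-item" identifier are assumptions
made up for this example; nested dicts and lists stand in for the nested PHP
arrays of the API result.

```python
import json
from typing import Any, Dict
from xml.sax.saxutils import escape

# Hypothetical set of content models whose serialization is JSON.
JSON_MODELS = {"wikidata-item"}


def to_api_result(model: str, serialized: str) -> Dict[str, Any]:
    """Map serialized content onto a nested structure for the API result."""
    if model in JSON_MODELS:
        # JSON maps directly onto nested dicts and lists.
        data: Any = json.loads(serialized)
    else:
        # Fallback for models such as SVG: hand the markup over as one string.
        data = serialized
    return {"model": model, "data": data}


def to_xml_text(value: str) -> str:
    """Escape a string so it can be embedded as text in an XML API response."""
    return escape(value)
```

For example, to_xml_text("<svg/>") yields "&lt;svg/&gt;", which is the
"escaped XML as a string" case.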
Note that if we allow a), we can still allow b) at the same time - for
Wikidata, we will definitely implement a special purpose editing interface
that supports stuff like "add value for language x to property y", etc.
> Just like database schemas change, I expect extensions to also want to
> alter the format of data as they add new features.
Indeed. This is why in addition to a data model identifier, the
serialization format is explicitly tracked in the database and will be
present in dumps and via the web API.
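
As a sketch of why tracking both values helps: a reader can pick the right
parser from the (model, format) pair, so a later format change only adds an
entry instead of breaking old revisions. The class, the registry and the
identifiers below are hypothetical, written in Python for brevity.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class StoredContent:
    model: str   # data model identifier, e.g. "wikidata-item" (hypothetical)
    format: str  # serialization format, e.g. "application/json"
    blob: str    # the serialized content itself


# Registry of parsers keyed by (model, format). A new serialization format
# for an existing model is supported by adding one more entry.
DESERIALIZERS: Dict[Tuple[str, str], Callable[[str], Any]] = {
    ("wikidata-item", "application/json"): json.loads,
    ("wikitext", "text/x-wiki"): lambda text: text,
}


def load(content: StoredContent) -> Any:
    """Deserialize stored content using its recorded model and format."""
    try:
        parser = DESERIALIZERS[(content.model, content.format)]
    except KeyError:
        raise ValueError(
            f"no deserializer for {content.model} in format {content.format}"
        )
    return parser(content.blob)
```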
> Also I've thought about something like this for quite a while. One of
> the things I'd really like us to do is start using real metadata even
> within normal WikiText pages. We should really replace in-page
> [[Category:]] with a real string of category metadata. Which we can
> then use to provide good intuitive category interfaces. ([[Category:]]
> would be left in for templates, compatibility, etc...).
That could be implemented using a "multipart" content type. But I don't want
to get into this too deeply - multipart has a lot of cool uses, but it's
beyond what we will do for Wikidata.
> This case especially tells me that raw is not something that should be
> outputting the raw data, but should be something which is implemented
> by whatever implements the normal handling for that serialized data.
You mean action=raw? Yes, I agree. action=raw should not return the actual
serialized format. It should probably return nothing or an error for
non-text content. For multipart pages it would just return the "main part",
without the "extensions".
But the entire "multipart" stuff needs more thought. It has a lot of great
applications, but it's beyond the scope of Wikidata, and it has some
additional implications (e.g. can the old editing interface be used to edit
"just the text" while keeping the attachments?).
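
The action=raw behaviour described above could be sketched like this, again
in Python with made-up names. The model identifiers, the "main" key and the
raw_view function are assumptions for illustration; returning None stands in
for "nothing or an error".

```python
from typing import Dict, Optional

TEXT_MODELS = {"wikitext"}     # hypothetical set of text-based models
MULTIPART_MODEL = "multipart"  # hypothetical identifier


def raw_view(model: str, parts: Dict[str, str]) -> Optional[str]:
    """Return what action=raw would serve for a page, or None if nothing.

    `parts` maps part names to serialized content. A plain page has only a
    "main" part; a multipart page carries extra parts next to it.
    """
    if model in TEXT_MODELS or model == MULTIPART_MODEL:
        # Serve only the main text and leave out any extension parts.
        return parts.get("main")
    # Non-text content: never expose the serialized form through action=raw.
    return None
```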
-- daniel