"Aryeh Gregor" Simetrical+wikilist@gmail.com wrote in message news:7c2a12e21001180857x24bac57fp824c019956143d59@mail.gmail.com...
On Mon, Jan 18, 2010 at 7:47 AM, Henri Sivonen hsivonen@iki.fi wrote:
- Output a very few pieces of metadata that would be useful to HTML consumers, like license metadata. For these, we should use microdata or RDFa, maybe just with one or two vocabularies whitelisted, and it would be simplest to just let people type it into templates via wikitext. I'm pretty certain about this.
Eh? I get the feeling that we're reading from totally different song sheets here. What you seem to be saying is that you expect the use case to be 'license templates on steroids': on the image description page, we have license templates that now emit microdata/RDF/the-metadata-format-of-the-month, which can be picked up by whoever is interested. That's not MediaWiki doing anything active with the data, and it's absolutely no different from marking up infoboxes. In fact, the use case for infoboxes is arguably stronger, because their data structure is more complicated and harder to machine-read otherwise.
What I had assumed we meant by "MediaWiki do stuff with metadata" was that it would pick up metadata about an image, and then output that **wherever the image is used**. So when you view an article with an image, that use of the image has a metadata cloud that describes where the image is from, what its license is, whatever. Information that, for an external image, might not be available via JavaScript or other means. I see things like the "put-a-red-border-round-fair-use-images" script I have in my monobook being implemented just by picking out that metadata, without having to run stacks of API queries.
That use case is incredibly badly served by just allowing raw metadata in the image page wikitext; it's really no different to adding categories via a license template. MediaWiki needs to have that metadata stored separately from wikitext, or at least entered via wikitext in a parser-friendly way: the customary way for the parser to pick 'stuff' out of wikitext is with parser functions, magic words, link syntax, whatever.
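As a minimal sketch of the "parser-friendly" route, assume a hypothetical parser function `{{#imagemeta:key|value}}` (this is not an existing MediaWiki feature, and the sketch is Python for illustration rather than MediaWiki's PHP). The point is only that the parser collects key/value pairs into its own store and strips the calls from the page text, so nothing downstream depends on any particular output markup:

```python
import re

# Hypothetical syntax: {{#imagemeta:key|value}}. The name "imagemeta" is
# made up for this sketch and is not a real MediaWiki parser function.
META_CALL = re.compile(r"\{\{#imagemeta:\s*([^|{}]+?)\s*\|\s*([^{}]*?)\s*\}\}")


def extract_metadata(wikitext):
    """Return (metadata dict, wikitext with the metadata calls removed)."""
    metadata = {}

    def collect(match):
        # Store the pair and replace the call with nothing in the page text.
        metadata[match.group(1)] = match.group(2)
        return ""

    remaining = META_CALL.sub(collect, wikitext)
    return metadata, remaining


page = (
    "{{#imagemeta:license|CC-BY-SA-3.0}}"
    "{{#imagemeta:author|Example User}}"
    "A photo of a bridge."
)
meta, text = extract_metadata(page)
```

Here `meta` holds the license and author as plain data, and `text` is the description with the calls removed; how that data is later emitted is a separate decision.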
We can always add new input formats or switch the output format later if we have good reason, though. Especially if we keep input restricted to one or two vocabularies -- or three, which for microdata is all of them right now. :)
Again, I don't know which side of the coin you're talking about: switching the output format is trivial *iff* there's a separation between the input and output. If MW is extracting its metadata by reading [format] out of wikitext, then *adding* new formats becomes a PITA, and *removing* formats becomes impossible. So much better to have a format-independent input system for extracting metadata, and then be able to implement any of a range of outputs as dictated by the times.
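A rough sketch of that separation, again in Python purely for illustration: the stored record is format-neutral, and each output format is one small function registered in a table. The field names, the `itemtype` URL and the file name are placeholders invented for this example; only the attribute names (`itemscope`, `itemtype`, `itemprop` for microdata; `about`, `rel` for RDFa) come from the formats themselves.

```python
from html import escape

# Format-neutral record, as the parser might store it.
# Field names and the file name are made up for this sketch.
image_meta = {
    "subject": "File:Example.jpg",
    "license_url": "https://creativecommons.org/licenses/by-sa/3.0/",
}


def to_microdata(meta):
    # The itemtype URL is a placeholder, not a real vocabulary.
    return (
        '<span itemscope itemtype="http://example.org/vocab/Image">'
        '<a itemprop="license" href="{}">license</a></span>'
    ).format(escape(meta["license_url"], quote=True))


def to_rdfa(meta):
    return (
        '<span about="{}"><a rel="license" href="{}">license</a></span>'
    ).format(
        escape(meta["subject"], quote=True),
        escape(meta["license_url"], quote=True),
    )


# Adding or dropping an output format only touches this table;
# the stored metadata and the wikitext stay as they are.
OUTPUT_FORMATS = {"microdata": to_microdata, "rdfa": to_rdfa}


def render(meta, output_format):
    return OUTPUT_FORMATS[output_format](meta)


microdata_html = render(image_meta, "microdata")
rdfa_html = render(image_meta, "rdfa")
```

Dropping a format then means deleting one entry from `OUTPUT_FORMATS`, which is exactly what cannot be done if the format is baked into the wikitext of every image page.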
--HM