Re: [Wikitech-l] RDFa and Microdata in MediaWiki

19 Jan 2010


      "Aryeh Gregor" Simetrical+wikilist@gmail.com wrote in message 
news:7c2a12e21001180857x24bac57fp824c019956143d59@mail.gmail.com...
...
On Mon, Jan 18, 2010 at 7:47 AM, Henri Sivonen hsivonen@iki.fi wrote:

Output a very few pieces of metadata that would be useful to HTML
consumers, like license metadata.  For these, we should use microdata
or RDFa, maybe just with one or two vocabularies whitelisted, and it
would be simplest to just let people type it into templates via
wikitext.  I'm pretty certain about this.

Eh?  I get the feeling that we're reading from totally different song sheets
here.  What you seem to be saying is that you expect the use case to be
'license templates on steroids': on the image description page, we have
license templates that now emit
microdata/RDF/the-metadata-format-of-the-month, which can be picked up by
whoever is interested.  That's not MediaWiki doing anything active with the
data, and it's absolutely no different from marking up infoboxes.  In fact,
the use case for infoboxes is arguably stronger, because their data structure
is more complicated and harder to machine-read otherwise.

What I had assumed we meant by "MediaWiki do stuff with metadata" was that it
would pick up metadata about an image, and then output that **wherever the
image is used**.  So when you view an article with an image, that use of the
image has a metadata cloud that describes where the image is from, what its
license is, whatever.  Information that, for an external image, might not be
available via JavaScript or other means.  I see things like the
"put-a-red-border-round-fair-use-images" script I have in my monobook being
implemented just by picking out that metadata, and without having to run
stacks of API queries.

That use case is incredibly badly served by just allowing raw metadata in the
image page wikitext; it's really no different to adding categories via a
license template.  MediaWiki needs to have that metadata stored separately
from wikitext, or at least entered via wikitext in a parser-friendly way:
the customary way for the parser to pick 'stuff' out of wikitext is with
parser functions, magic words, link syntax, whatever.
...
We can always add new input formats or switch the output format later
if we have good reason, though.  Especially if we keep input
restricted to one or two vocabularies -- or three, which for microdata
is all of them right now.  :)

Again, I don't know which side of the coin you're talking about: switching
the output format is trivial *iff* the input is decoupled from the
output.  If MW is extracting its metadata by reading [format] out of
wikitext, then *adding* new formats becomes a PITA, and *removing* formats
becomes impossible.  So it is much better to have a format-independent input
system for extracting metadata, and then be able to implement any of a range
of outputs as dictated by the times.
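
To make the input/output split concrete, here is a minimal sketch in Python
(not MediaWiki code).  Everything in it is an assumption for illustration: the
ImageMetadata record, its two fields, the renderer names, and the attribute
and property names in the emitted markup, which are not tied to any particular
whitelisted vocabulary.  The only point is that the stored record knows
nothing about the output syntax.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ImageMetadata:
    """Hypothetical format-neutral record, stored apart from wikitext."""
    license_url: str
    author: str


def render_rdfa(meta: ImageMetadata) -> str:
    # Illustrative RDFa-style markup; property names are placeholders.
    return (
        f'<a rel="license" href="{escape(meta.license_url)}">License</a> '
        f'<span property="dc:creator">{escape(meta.author)}</span>'
    )


def render_microdata(meta: ImageMetadata) -> str:
    # Illustrative microdata-style markup; itemprop names are placeholders.
    return (
        '<span itemscope>'
        f'<a itemprop="license" href="{escape(meta.license_url)}">License</a> '
        f'<span itemprop="author">{escape(meta.author)}</span>'
        '</span>'
    )


# Output formats live in one table; adding or removing a format only
# touches this table and never the stored records.
RENDERERS = {
    "rdfa": render_rdfa,
    "microdata": render_microdata,
}


def render(meta: ImageMetadata, fmt: str) -> str:
    """Serialize the same record in whichever output format is requested."""
    return RENDERERS[fmt](meta)


if __name__ == "__main__":
    meta = ImageMetadata(
        license_url="http://creativecommons.org/licenses/by-sa/3.0/",
        author="Example Author",
    )
    for fmt in RENDERERS:
        print(fmt, "->", render(meta, fmt))
```

Dropping a format here means deleting one entry from RENDERERS.  If the
metadata had instead been typed into wikitext in one of those syntaxes, every
page carrying it would have to be rewritten.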
--HM
