On Wed, Jan 20, 2010 at 10:02 AM, Lane, Ryan Ryan.Lane@ocean.navo.navy.mil wrote:
Why shouldn't we use a technology-neutral input format? What happens if microdata is replaced by something better/easier/simpler? I also don't necessarily think we should lock users into a certain technology. If we choose a neutral input format, users can decide which output they wish to use (via extensions).
Doesn't this same argument apply to using any new HTML feature?
On Wed, Jan 20, 2010 at 11:47 AM, Happy-melon happy-melon@live.com wrote:
You could say that we're talking about different things again; that you're talking about marking up data for external use. But there's no reason why a {{#prop:foo|bar}} magic word can't *also* output some appropriate metadata format into the wikitext. Marking up in a format-neutral syntax allows us to output metadata from wikitext *and* from MW generally, and to change *both* formats at the drop of a hat. Marking up in a particular format, whatever the format is, makes it damn near impossible (or at least hopelessly hackish) to change wikitext output from one format to another, and equally horrible for MW to collect data at all.
Okay, I'll grant that for an RDF-style use-case, parser functions are a better bet than the alternatives. However, I'm not sure that's the case for inline markup, in the limited cases where we want that (e.g., image licenses). The problem here is that you'd have to associate the metadata with particular phrases. You can't say {{#prop:license|CC-BY-SA-2.0}} and output that as proper microdata/RDFa -- or rather you could, but only by creating empty content nodes someplace. I guess that would work . . . it's not good practice if you're hand-authoring, and it would take a bit more space, but it might indeed make sense from our POV.
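A minimal sketch of the "empty content node" idea, assuming a format-neutral (name, value) pair such as the one a {{#prop:license|CC-BY-SA-2.0}} call would carry. The function name render_prop and the format keys are hypothetical and not existing MediaWiki code. The microdata output would also need to sit inside an element with itemscope, and the RDFa property name would need a vocabulary or prefix in scope; both are left out here.

```python
from html import escape


def render_prop(name, value, fmt):
    """Render one format-neutral (name, value) pair as an empty node.

    Hypothetical helper: the node carries the value in an attribute,
    so it adds metadata without adding any visible text.
    """
    n = escape(name, quote=True)
    v = escape(value, quote=True)
    if fmt == "microdata":
        # Microdata: an empty <meta> element with itemprop/content.
        return f'<meta itemprop="{n}" content="{v}">'
    if fmt == "rdfa":
        # RDFa: an empty <span> whose literal value comes from @content.
        return f'<span property="{n}" content="{v}"></span>'
    if fmt == "none":
        # Metadata output switched off entirely.
        return ""
    raise ValueError(f"unknown metadata format: {fmt!r}")


if __name__ == "__main__":
    for fmt in ("microdata", "rdfa", "none"):
        print(fmt, "->", render_prop("license", "CC-BY-SA-2.0", fmt))
```

Because the wikitext only ever states the pair, switching the site from one output format to the other (or to none) would be a change to the fmt argument rather than to every page.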
But then there's the question of writing it. The code for raw microdata/RDFa output is already written, and is pretty trivial besides. Is anyone willing to write core code to do this metadata abstraction with a parser function, and output in appropriate formats? If not, the choice is microdata, RDFa, or nothing.
On Wed, Jan 20, 2010 at 1:38 PM, Conrad Irwin conrad.irwin@googlemail.com wrote:
I do not like the idea of having a parser function that outputs the data into the article - if people want the meta-data they can query it from an API, or a dump, as opposed to screen-scraping. Perhaps meta-data on image pages is useful, but if someone wants to get licenses of all the images, surely providing a single file containing all of them is better than screen-scraping for it.
Not for search engines. They're spidering all the pages anyway, so it's easier for them to not retrieve a separate page. Besides, how would they know how to find the metadata if it's not included or pointed to on the page in some standard format?
On Wed, Jan 20, 2010 at 7:10 PM, Manu Sporny msporny@digitalbazaar.com wrote:
Aryeh, you're quoting something that I purposefully said off-list in an attempt to save this mailing list from the RDFa/Microdata tumult.
Oops! I'm *really* sorry. I didn't notice it was an off-list reply, so I copy-pasted it into my on-list reply to Happy-melon. Gmail threads off-list replies in the same conversation and doesn't provide any obvious visual cues. So I honestly thought that was an on-list reply. Sorry about that! I think it was a very nice and thoughtful reply overall, and hope I didn't do too much damage by (partially) publicizing it.
I will be responding shortly to the remaining questions that have been unanswered during this discussion and then leaving the discussion entirely. I don't feel that we are having a productive discussion here and the damage that I fear is resulting is the rejection of both Microdata and RDFa.
Based on current discussion, I think we'll end up going with one or the other for image licenses, probably with a toggle to use whichever you prefer. That is, if someone writes the code to do that, which is a significant "if".