On Mon, 13 Feb 2012 00:13:21 -0800, Gabriel Wicke <wicke(a)wikidev.net>
wrote:
On 02/13/2012 03:27 AM, Daniel Friesen wrote:
> Microdata items can be nested, so I don't
> see a problem with users or templates providing a mapping to more
> specific schemas like those of schema.org. Clashes of user-provided
> itemtypes with those used for
> editing purposes need to be prevented in the parser, but that is
> doable.
> Consumers are free to ignore itemtypes they don't know about, which is
> what Google etc are doing afaik- and what also motivated them to set up
>
> schema.org in the first place.
Hmmm... wait, now I'm confused: are we talking about a Microdata DOM
output that the Parser generates from WikiText, or a completely tailored
one where the template itself is authored in Microdata so that it can
describe how a Visual Editor should edit it?
I considered the case where users manually add a microdata item in a
template or page. The itemtype in that case can be anything, but would
most likely be a standard type.
Then I'm saying that I don't like itemtype being abused to be the
template name and itemprop being abused to be the template argument
name. Instead of the template name and parameter names being abused as
the schema of the template, I'd rather have a more verbose, proper set
of Microdata to describe it:
Could you elaborate why you consider one use of itemtype an abuse, while
the other would be fine?
An itemtype is supposed to be a proper type of what the data is. Something
expected, well-known, predefined. If possible there should be only one
for some type of thing. And one should be able to query for it already
knowing what that type is, like one would with an xmlns.
itemtype="http://en.wikipedia.org/wiki/Template:Cite" is not something
pre-defined. It practically appears dynamically out of nowhere with no
forethought. And if someone copies the template then that exact same set
of data has a completely different itemtype despite being the same thing.
Another point in this example. Template:Cite is actually a good example
here.
In a normal itemtype you generally stick to one name for something. You
have a citation type, and you have a "firstname" prop. And you can have
multiples of them, e.g.:
<span itemprop="firstname">Arnold</span>
<span itemprop="firstname">Harold</span>
(though in a really good type you'd likely have a separate itemtype to
group all the info of a name into one itemprop="name" itemscope ...).
However in a template we get this:
|first=Arnold
|first2=Harold
Resulting in what you'd say would be:
<span itemprop="first">Arnold</span>
<span itemprop="first2">Harold</span>
That's nothing close to a properly defined itemtype that actually allows
3rd parties to extract data in any sane way. Nor is it something a Visual
Editor would make use of without a wildcard hack where it examines every
itemtype and decides that any URL pointing back to the wiki is something
it can edit. Anything that actually manages to extract data from that kind
of thing is a hack at its very core.
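To make the point concrete, here is a minimal sketch of the kind of guesswork a consumer would be pushed into. The trailing-number rule and the function name are my own assumptions for illustration, not something any existing tool is known to do:

```python
import re
from collections import defaultdict

# Hypothetical heuristic: treat a trailing number as an index, so that
# "first" and "first2" are both read as the property "first".
NUMBERED_PARAM = re.compile(r"^(?P<base>.*?\D)(?P<index>\d+)$")


def group_numbered_params(props):
    """Group (name, value) pairs by their guessed base name, keeping order."""
    grouped = defaultdict(list)
    for name, value in props:
        match = NUMBERED_PARAM.match(name)
        base = match.group("base") if match else name
        grouped[base].append(value)
    return dict(grouped)


print(group_numbered_params([("first", "Arnold"), ("first2", "Harold")]))
```

The same rule silently mangles any parameter whose name legitimately ends in a digit (a parameter called "utf8" would be grouped under "utf"), which is exactly why this sort of extraction stays a hack.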
Whereas when we use
`itemtype="http://www.mediawiki.org/microdata/wikitext/Transclusion"` and
`itemprop="Argument" itemscope
itemtype="http://www.mediawiki.org/microdata/wikitext/Argument"` we have a
predefined type. We're formally describing a transclusion of a template
into another page, and the arguments used. The format of this is defined
beforehand. We can add in extra data that would have been a hack before,
like the canonical pagename of the template, or perhaps even some metadata
that is stored inside the template itself. For example, say SemanticForms
implemented some embedded editor form code. A template could add extra
metadata saying that the template's content should be edited using a
defined Semantic Form. The Visual Editor would then use that information
to embed a small area that allows Semantic Forms to be used to edit the
template inline, allowing editing of things that could potentially be too
complex for the Visual Editor to understand how to make editable. Though
that's really just an example off the top of my head, there are probably
other things that could use metadata from the template to improve the
Visual Editor's ability to make templates editable as intuitively as
possible.
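A rough sketch of what emitting that predefined format could look like. Only the two itemtype URLs and itemprop="Argument" come from what I wrote above; the "template", "name" and "value" property names and the function itself are assumptions made up for illustration:

```python
from html import escape

# The itemtype URLs and itemprop="Argument" are taken from this message.
# The "template", "name" and "value" property names are assumptions.
BASE = "http://www.mediawiki.org/microdata/wikitext/"


def render_transclusion(template, arguments):
    """Return Microdata markup describing one template transclusion."""
    lines = [
        f'<div itemscope itemtype="{BASE}Transclusion">',
        f'  <meta itemprop="template" content="{escape(template)}">',
    ]
    for name, value in arguments:
        lines += [
            f'  <span itemprop="Argument" itemscope itemtype="{BASE}Argument">',
            f'    <meta itemprop="name" content="{escape(name)}">',
            f'    <span itemprop="value">{escape(value)}</span>',
            "  </span>",
        ]
    lines.append("</div>")
    return "\n".join(lines)


print(render_transclusion("Template:Cite", [("first", "Arnold"), ("first2", "Harold")]))
```

The argument names stay plain data inside a fixed schema, so copying or renaming the template changes a property value rather than the itemtype.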
I'm not quite sure if we're trying to describe templates in a way that
the VisualEditor can extract the parameters from, edit them inline (if
possible), or describe the output of a template in a way that can be
read by machines for some separate purpose.
We are trying to address all three with the same mechanism. In
particular, we are trying to aid the discovery of semantics associated
with (many) template parameters for the benefit of search engines or
projects like DBPedia and WikiData.
Gabriel
Projects like DBPedia already hack around trying to extract data from
the parameters passed to a template, using tricks to associate some sort
of meaning with template parameters without getting that information from
the wiki itself. For them, using an
itemtype="http://www.mediawiki.org/microdata/wikitext/Transclusion" is
basically a formal way to extract the parameters of a template without
having to do the unreliable work of attempting to parse the WikiText
themselves. So it's still a usable improvement.
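As a sketch of what that extraction might look like on the consumer side, using only Python's standard html.parser. The sample markup is hypothetical: the itemtype URLs are the ones proposed above, while the "name" and "value" property names are assumptions, and the parser assumes each value holds plain text only:

```python
from html.parser import HTMLParser

ARGUMENT_TYPE = "http://www.mediawiki.org/microdata/wikitext/Argument"

# Hypothetical parser output; "name" and "value" are assumed property names.
SAMPLE = """
<div itemscope itemtype="http://www.mediawiki.org/microdata/wikitext/Transclusion">
  <span itemprop="Argument" itemscope itemtype="http://www.mediawiki.org/microdata/wikitext/Argument">
    <meta itemprop="name" content="first">
    <span itemprop="value">Arnold</span>
  </span>
  <span itemprop="Argument" itemscope itemtype="http://www.mediawiki.org/microdata/wikitext/Argument">
    <meta itemprop="name" content="first2">
    <span itemprop="value">Harold</span>
  </span>
</div>
"""


class ArgumentExtractor(HTMLParser):
    """Collect the name/value pair of every Argument item in the markup."""

    def __init__(self):
        super().__init__()
        self.arguments = []     # one {"name": ..., "value": ...} per Argument
        self._in_value = False  # True while inside an itemprop="value" element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("itemtype") == ARGUMENT_TYPE:
            self.arguments.append({"name": None, "value": ""})
        elif self.arguments and attrs.get("itemprop") == "name":
            self.arguments[-1]["name"] = attrs.get("content")
        elif self.arguments and attrs.get("itemprop") == "value":
            self._in_value = True

    def handle_data(self, data):
        if self._in_value:
            self.arguments[-1]["value"] += data

    def handle_endtag(self, tag):
        # Values are assumed to be plain text, so the first closing tag
        # after a value element ends that value.
        self._in_value = False


parser = ArgumentExtractor()
parser.feed(SAMPLE)
parser.close()
print(parser.arguments)
```

Nothing here needs to know which template was transcluded or how WikiText is parsed; the consumer only has to know the two predefined types.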
For search engines and other 3rd parties, I don't believe any of them are
going to want to go around to every wiki and start hardcoding into their
code things like
itemtype="http://mywiki.com/wiki/Template:Event" and
itemtype="http://yourwiki.com/wiki/Template:OurEvent" both describing an
event they would extract. I don't think we're going to get good metadata
for general 3rd parties without actually embedding proper formal microdata
into templates themselves.
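A tiny sketch of why, from the consumer's point of view. The lookup table and function are hypothetical; http://schema.org/Event is the shared type a template author could embed instead:

```python
# Hypothetical consumer: it only understands the types it was written for.
KNOWN_TYPES = {"http://schema.org/Event": "event"}


def classify(itemtype):
    """Return the consumer's label for an itemtype, or None if unknown."""
    return KNOWN_TYPES.get(itemtype)


for url in (
    "http://schema.org/Event",
    "http://mywiki.com/wiki/Template:Event",
    "http://yourwiki.com/wiki/Template:OurEvent",
):
    print(url, "->", classify(url))
```

Every wiki-specific itemtype falls through unless someone adds it to the table by hand, one wiki at a time.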
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]