On 02/13/2012 10:28 PM, Daniel Friesen wrote:
itemtype="http://www.mediawiki.org/microdata/wikitext/Transclusion&quo… is
basically a formal way to extract the parameters of a template without
having to do the unreliable work of attempting to parse the WikiText
themselves. So it's still a usable improvement.
The main issue I have with this style of purely structural itemtype is
its limited pragmatic value compared to its significantly increased
cost. A relatively light-weight fragment like
<div
itemtype="http://en.wikipedia.org/wiki/Template:Foo" itemscope>
<span itemprop="firstname">The first name</span>
</div>
would be blown up to something like
<div
itemtype="http://www.mediawiki.org/microdata/wikitext/Transclusion&quo…
itemscope>
<meta itemprop="source"
content="http://en.wikipedia.org/wiki/Template:Foo" />
<span itemprop="Argument"
itemtype="http://www.mediawiki.org/microdata/wikitext/Argument" itemscope>
<meta itemprop="argname" content="firstname">
<span itemprop="argvalue">The first name</span>
</span>
</div>
This would increase the memory used for the DOM, slow down network
transfers and processing and make it unlikely that we could leave this
information in regular rendered pages.
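
To illustrate how little a consumer needs in order to read the
light-weight form, here is a minimal sketch in Python using only the
standard library. The class and function names (ItemExtractor,
extract_items) are made up for this example, and the parser only handles
flat, non-nested items like the first fragment above; it is not a full
microdata implementation.

```python
from html.parser import HTMLParser


class ItemExtractor(HTMLParser):
    """Collect itemprop text per itemscope element (flat items only)."""

    def __init__(self):
        super().__init__()
        self.items = []            # one dict per itemscope element
        self._current_prop = None  # itemprop whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            # The template URL is carried directly by the itemtype.
            self.items.append({"type": attrs.get("itemtype"), "props": {}})
        elif "itemprop" in attrs and self.items:
            self._current_prop = attrs["itemprop"]
            self.items[-1]["props"][self._current_prop] = ""

    def handle_data(self, data):
        if self._current_prop is not None:
            self.items[-1]["props"][self._current_prop] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current_prop = None


def extract_items(html):
    parser = ItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


FRAGMENT = """<div
itemtype="http://en.wikipedia.org/wiki/Template:Foo" itemscope>
<span itemprop="firstname">The first name</span>
</div>"""

items = extract_items(FRAGMENT)
```

Here items holds a single entry whose type is the template URL and whose
props map "firstname" to "The first name", so the template parameters
are recoverable without the extra Argument wrappers.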
For search engines and other 3rd parties, I don't believe any of them
are going to want to go around to every wiki and start hardcoding into
their code things like
itemtype="http://mywiki.com/wiki/Template:Event"
and
itemtype="http://yourwiki.com/wiki/Template:OurEvent" both
describing an event they would extract. I don't think we're going to get
good metadata for general 3rd parties without actually embedding proper
formal microdata into templates themselves.
Unfortunately, they would have to do the same hardcoding with a global
Transclusion itemtype, as the only thing that allows an association of
vocabulary semantics (the template source URL in the meta element) still
contains the URL of the wiki. So the added complexity does not really
simplify the extraction of semantically defined data.
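
A rough sketch of that point, in Python: whichever representation is
used, a third party ends up with the same per-wiki lookup table keyed on
the template URL. The table, the function names and the mapping to
http://schema.org/Event are assumptions made up for this illustration.

```python
# Hypothetical table a consumer would have to maintain by hand.
EVENT = "http://schema.org/Event"
TEMPLATE_VOCABULARY = {
    "http://mywiki.com/wiki/Template:Event": EVENT,
    "http://yourwiki.com/wiki/Template:OurEvent": EVENT,
}


def vocabulary_from_itemtype(itemtype):
    """Light-weight form: the template URL is the itemtype itself."""
    return TEMPLATE_VOCABULARY.get(itemtype)


def vocabulary_from_transclusion(props):
    """Structural form: the template URL sits in the 'source' property."""
    return TEMPLATE_VOCABULARY.get(props.get("source"))
```

Both functions consult the same wiki-specific keys; the global
Transclusion itemtype only moves the URL from the itemtype attribute to
a nested property.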
To improve this, I am all in favor of adding schema and editor-specific
information to templates. The most natural storage location for this
extra information would be directly in the documentation section of the
template it describes. This makes it easy to find and edit, and ensures
that the schema is copied along with the template. Some of this extra
information might even be usable to automatically add additional,
globally defined (schema.org or similar) itemtypes to the rendered
output, which can make the information directly available to search
engines without any manual work on their part.
I also don't think that prefix matches on the itemtype instead of a full
string match are quite as hard or hacky as you make them out to be. Search
engines already routinely perform this in their crawlers to support
schema extensions:
http://schema.org/docs/extension.html.
A global itemtype hierarchy for templates could still be introduced
along with a central repository of generally useful and semantically
annotated templates. Something like
http://mediawiki.org/md/Transclusion/Cite maybe, with the option to
subclass as
http://mediawiki.org/md/Transclusion/Cite/en.wikipedia.org
if a local extension is needed.
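
As a minimal sketch of such a prefix match, assuming the example URLs
above and a hypothetical helper named itemtype_matches, a consumer could
accept the base type and any '/'-separated extension of it like this
(Python):

```python
def itemtype_matches(itemtype, base):
    """True if itemtype equals base or extends it with a '/' suffix."""
    base = base.rstrip("/")
    # Requiring the '/' boundary avoids false positives such as
    # ".../Citation" matching a base of ".../Cite".
    return itemtype == base or itemtype.startswith(base + "/")


CITE = "http://mediawiki.org/md/Transclusion/Cite"
LOCAL_CITE = "http://mediawiki.org/md/Transclusion/Cite/en.wikipedia.org"
```

With this check, both CITE and LOCAL_CITE match the base CITE, while an
unrelated type that merely shares the leading characters does not.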
For the editor project, we mainly need an efficient representation of
the needed information with minimal changes to the rendered output. Any
solution that requires us to add many additional elements will simply
not work for us. The exact itemtype URL used, on the other hand, is easily
adjusted if a useful global hierarchy emerges.
Gabriel