On Sat, Jan 16, 2010 at 10:07 PM, Jesse (Pathoschild) pathoschild@gmail.com wrote:
> Unfortunately, categories and database queries are inadequate for our needs. Someone can indeed navigate to Categories::Works::Works by genre::Non-fiction::Governmental::Biographies::Ancient biographies, and they'll find all 5 pages that someone thought to categorize to this depth. But if someone hopes to find our 1872 American biographies, they are going to be sorely disappointed.
You can do this with database queries just fine -- there are already several toolserver tools that will do category intersections for you, and a couple of extensions. In fact, bog-standard search will do it for you, although AFAIK only for categories added literally (not by templates):
http://en.wikipedia.org/w/index.php?title=Special:Search&redirs=1&se...
It wouldn't be that hard to allow template-added categories too. I assume you have categories like "books published in America", "books published in 1872", and "biographies" -- if not, you can easily add them via your templates (although that wouldn't work right now with standard search AFAIK, it would work with things like CatScan).
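To make "category intersection" concrete, here is a minimal sketch in Python. It is not how CatScan or the search backend actually works; it only assumes that category membership is available as a mapping from category name to a set of page titles. The page titles and the intersect_categories helper are made up for illustration.

```python
def intersect_categories(members, *categories):
    """Return the sorted page titles that belong to every named category."""
    if not categories:
        return []
    # An unknown category has no members, so the intersection becomes empty.
    page_sets = [members.get(name, set()) for name in categories]
    return sorted(set.intersection(*page_sets))


# Hypothetical membership data; a real tool would read this from the
# categorylinks table or the API rather than a hard-coded dict.
CATEGORY_MEMBERS = {
    "Books published in America": {"Life of A", "Travels of B", "Life of C"},
    "Books published in 1872": {"Life of A", "Essays of D", "Life of C"},
    "Biographies": {"Life of A", "Life of C", "Life of E"},
}

if __name__ == "__main__":
    print(intersect_categories(
        CATEGORY_MEMBERS,
        "Books published in America",
        "Books published in 1872",
        "Biographies",
    ))
```

The point is that three shallow categories are enough to answer the "1872 American biographies" question; nobody has to maintain a deep category tree for every combination.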
> If we simply extend MediaWiki to support metadata for works or authors, the metadata is limited to these types and fields. Public metadata can be extended and parsed in any way the local community or our content users feel useful.
Sure, but this is not internal use, so not relevant to my last post.
> This is also not possible with database queries, since the metadata is not provided to the software except as part of the wiki text.
It is if you use categories. It would also be possible to hack up some tool to store all template parameter-value pairs, which are strikingly similar to the idea of RDFa triples: (article, template+parameter name, parameter value).
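As a rough illustration of the (article, template+parameter name, parameter value) idea, the sketch below pulls named parameters out of simple template calls. It is an assumption-laden toy, not an existing MediaWiki tool: the regex only handles non-nested templates, positional parameters are skipped, and the "header" template with its "year" and "country" parameters is a hypothetical example.

```python
import re

# Matches a simple, non-nested template call such as {{header|year=1872}}.
# Real wikitext allows nesting, links, and parser functions, which this
# sketch deliberately ignores.
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")


def template_triples(article, wikitext):
    """Return (article, "template.parameter", value) triples for named parameters."""
    triples = []
    for match in TEMPLATE_RE.finditer(wikitext):
        template = match.group(1).strip()
        for part in match.group(2).split("|"):
            if "=" not in part:
                continue  # positional parameter, skipped in this sketch
            name, value = part.split("=", 1)
            triples.append((article, f"{template}.{name.strip()}", value.strip()))
    return triples


if __name__ == "__main__":
    text = "{{header|title=Life of A|year=1872|country=United States}}"
    for triple in template_triples("Life of A", text):
        print(triple)
```

Stored in a table, triples like these could be queried directly (for example, every article with header.year equal to 1872) without going through categories at all.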
> There is very little difference between internal and external use; it's no easier for a Wikisource editor to find those 1872 American biographies. Editors are also users.
By "internal use" I mean "use by software designed only to work with MediaWiki", not "use by Wikimedia users". Standards are only needed if we want to be useful to software that's also meant to work with other sites. That way, the software can use the same code to process both our site and the other sites, since they all output the same standard markup. If the software is only processing MediaWiki sites to begin with, then standard markup is useless. (Unless the standard happens to come with convenient libraries, as XML does -- but that's probably not the case here.)
> So, these metadata formats are definitely *not* useless for internal community use.
No, they really are. It's almost certainly more work for us to use a standard of any kind than to make up our own internal format, so if we only care about internal use, bothering with standards is counterproductive. The real use-cases are for external users only.