As most of you know, the MediaWiki namespace is quickly becoming a generic
template system with parameters. Currently it only supports unnamed
parameters, e.g. {{country|Germany|82 million}}, but it is my
understanding that named parameters, e.g.
{{country|name=Germany|population=82 million}}, are firmly planned.
This call would use the "Template:Country" page, which would hold the box
that is currently entered manually for each country article on Wikipedia,
and replace certain variables in it ($name, $population) with the assigned
parameters.
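To make the substitution step concrete, here is a minimal sketch in Python. The template text and the function name render_template are made up for illustration; the real "Template:Country" page would hold the full country box, and the actual parser may work quite differently.

```python
import re

def render_template(template_text, params):
    """Replace $name-style variables with the supplied parameter values.

    Variables without a matching parameter are left untouched.
    """
    def substitute(match):
        return params.get(match.group(1), match.group(0))
    return re.sub(r"\$(\w+)", substitute, template_text)

# Hypothetical content of the "Template:Country" page.
country_template = "Country: $name, population: $population"

rendered = render_template(
    country_template, {"name": "Germany", "population": "82 million"}
)
print(rendered)
```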
From named parameters it is a relatively small step to searchable
metadata.
I am thinking of a new table called METADATA:

| template_name | parameter_name | parameter_value | page_id |
|---------------|----------------|-----------------|---------|
| country       | name           | Germany         | 23923   |
| country       | population     | 82 million      | 23923   |
| country       | name           | United States   | 4834    |
| country       | population     | 290 million     | 4834    |
What this means:
The page 23923 ([[Germany]]) includes the text
{{country|name=Germany|population=82 million}}
The page 4834 ([[United States]]) includes the text
{{country|name=United States|population=290 million}}
Upon saving a page, all template calls would be parsed, and the METADATA
table would be updated accordingly.
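The save-time update could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses SQLite in place of the real database, the function names are hypothetical, and the regular expression handles only simple, non-nested calls (a value containing a piped link such as [[Foo|bar]] would be split incorrectly).

```python
import re
import sqlite3

# Matches simple, non-nested calls such as {{country|name=Germany|...}}.
TEMPLATE_CALL = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")

def parse_template_calls(wikitext):
    """Yield (template_name, parameter_name, parameter_value) triples."""
    for match in TEMPLATE_CALL.finditer(wikitext):
        template_name = match.group(1).strip()
        for part in match.group(2).split("|"):
            if "=" in part:  # unnamed parameters are skipped
                name, value = part.split("=", 1)
                yield template_name, name.strip(), value.strip()

def update_metadata(conn, page_id, wikitext):
    """Replace all METADATA rows for one page with its current template calls."""
    with conn:  # delete and insert in a single transaction
        conn.execute("DELETE FROM metadata WHERE page_id = ?", (page_id,))
        conn.executemany(
            "INSERT INTO metadata VALUES (?, ?, ?, ?)",
            [(t, n, v, page_id) for t, n, v in parse_template_calls(wikitext)],
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata (template_name TEXT, parameter_name TEXT,"
    " parameter_value TEXT, page_id INTEGER)"
)
update_metadata(conn, 23923, "{{country|name=Germany|population=82 million}}")
update_metadata(conn, 4834, "{{country|name=United States|population=290 million}}")
rows = conn.execute(
    "SELECT * FROM metadata ORDER BY page_id, parameter_name"
).fetchall()
```

Deleting the page's old rows before inserting the new ones keeps the table consistent when a template call is edited or removed from the page.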
Lastly, we would need a page Special:MetadataSearch, which would include a
search form like this:
-----------------------------------------
Template  : [country.......^]

Parameter : [              ]
Value     : [equals ^] [              ]
            AND (*)  OR ( )  AND NOT ( )

Parameter : [              ]
Value     : [equals ^] [              ]
            AND ( )  OR ( )  AND NOT ( )

etc.

ORDER BY  : [numerical value ^]
SORT ORDER: [ascending ^]
-----------------------------------------
It would of course run the appropriate query on the METADATA table. In
most cases I believe this would be fairly fast as most templates would
only appear on hundreds of pages, limiting the search space dramatically.
Now we could do things like:
* Show me all countries with a population larger than X
* Show me the country with the top-level domain .FR
* Show me all countries that use the US dollar as currency, but do not
speak English as an official language
etc., of course generically for all kinds of templates.
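As a rough sketch of how the form rows might translate into queries on the METADATA table, the Python below handles the third example (US dollar as currency, but not English). Everything here is an assumption for illustration: the parameter names currency and language, the sample rows, the page ID 9001 and the function name search are all hypothetical, only the "equals" operator is covered, and a "larger than" comparison would additionally need values stored in a numeric form.

```python
import sqlite3

def search(conn, template, conditions):
    """Return the sorted page_ids matching a list of form rows.

    Each condition is (connective, parameter, value), where connective is
    "AND", "OR" or "AND NOT"; the connective of the first row is ignored.
    Rows are applied top to bottom as set operations on page_ids.
    """
    result = None
    for connective, parameter, value in conditions:
        pages = {
            row[0]
            for row in conn.execute(
                "SELECT page_id FROM metadata WHERE template_name = ?"
                " AND parameter_name = ? AND parameter_value = ?",
                (template, parameter, value),
            )
        }
        if result is None:
            result = pages
        elif connective == "AND":
            result &= pages
        elif connective == "OR":
            result |= pages
        elif connective == "AND NOT":
            result -= pages
        else:
            raise ValueError(f"unknown connective: {connective}")
    return sorted(result or [])

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata (template_name TEXT, parameter_name TEXT,"
    " parameter_value TEXT, page_id INTEGER)"
)
# Illustrative sample rows only; parameter names and page 9001 are made up.
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?, ?)",
    [
        ("country", "currency", "US dollar", 4834),
        ("country", "language", "English", 4834),
        ("country", "currency", "US dollar", 9001),
        ("country", "language", "Spanish", 9001),
        ("country", "currency", "Euro", 23923),
        ("country", "language", "German", 23923),
    ],
)
matches = search(
    conn,
    "country",
    [("AND", "currency", "US dollar"), ("AND NOT", "language", "English")],
)
```

Because every lookup is restricted to one template_name first, each step only touches the rows of pages that use that template.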
As an extra bonus, it would be neat to be able to transclude the results
of these queries dynamically on a page, so I could have
[[List of countries by population]]
which would be dynamically generated from such a query.
Advantages of this system:
* Metadata is entered naturally, immediately associated with visible
output
* Very high flexibility
We can theoretically extend this relatively easily to allow for invisible
metadata that is not directly associated with a template:
{{data:person|firstname=Alfred|lastname=Wallace|birthyear=1823}}
Whether we want to do this is another question. I am leaning against it
because I feel it violates basic wiki principles: cause and effect
should be visible to the user. But if we establish the separation of
metadata and article that I alluded to in an earlier post, we may
reconsider this.
This proposal is directly relevant to the image copyright tagging problem,
which can now be solved relatively easily:
{{image
|license=[[GNU Free Documentation License]]
|source=photograph taken by [[User:Dogmaster3000]]
}}
This text could be auto-inserted into the page based on form data during
upload (the METADATA table would have to be updated at the same time). Now
we can query for images under a specific license, uploaded by a specific
user, etc.
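Generating that text from the upload form could be as simple as the following Python sketch. The function name build_template_call and the form field names are hypothetical; the upload code would use whatever fields the real form provides.

```python
def build_template_call(template, params):
    """Format a named-parameter template call from a dict of form fields."""
    parts = [template] + [f"{name}={value}" for name, value in params.items()]
    return "{{" + "|".join(parts) + "}}"

# Hypothetical form data collected during upload.
upload_form = {
    "license": "[[GNU Free Documentation License]]",
    "source": "photograph taken by [[User:Dogmaster3000]]",
}
tag = build_template_call("image", upload_form)
print(tag)
```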
There is of course some overlap with the category system here, but I feel
it is minimal. On the other hand, I think that in terms of implementation,
the strategies needed here are similar to those for keeping the LINKS
tables up to date, so maybe a shared implementation makes sense.
Let me know what you think. I'd be especially interested in Tim's opinion
as he is the main developer behind the template system.
I feel there is still a lot of potential in templates, and basic
programmability should also be considered. Features like optional
parameters and loops would open up a whole new range of possibilities. For
the typical user, things will become easier, not harder, as they now just
have to learn one syntax - {{template|foo=bar}} - and can use it to
influence data in very complex dynamic layouts.
All best,
Erik