As most of you know, the MediaWiki namespace is quickly becoming a generic template system with parameters. Currently it only supports unnamed parameters, e.g. {{country|Germany|82 million}}, but it is my understanding that named parameters, e.g. {{country|name=Germany|population=82 million}}, are firmly planned.
This particular call would use the "Template:Country" page, which would hold the box that is currently entered manually for each country article on Wikipedia, and replace certain variables in it ($name, $population) with the assigned parameters.
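To make the substitution step concrete, here is a minimal Python sketch of it. It is not MediaWiki's actual parser: the `$name` placeholder pattern, the function name `expand_template` and the sample template text are all assumptions made for illustration.

```python
import re

# Matches $name-style placeholders; the exact placeholder syntax is an assumption.
VARIABLE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

def expand_template(template_text, params):
    """Replace each $variable in template_text with its value from params."""
    def substitute(match):
        # Unknown variables are left as-is so that a missing parameter stays visible.
        return params.get(match.group(1), match.group(0))
    return VARIABLE.sub(substitute, template_text)

# Hypothetical content of the Template:Country page.
country_box = "Country: $name\nPopulation: $population"
print(expand_template(country_box, {"name": "Germany", "population": "82 million"}))
```

Leaving unknown variables untouched is one possible design choice; a real implementation might prefer an empty string or a default value.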
From named parameters it is a relatively small step to searchable metadata.
I am thinking of a new table called METADATA:

| template_name | parameter_name | parameter_value | page_id |
|------------------------------------------------------------|
| country       | name           | Germany         | 23923   |
| country       | population     | 82 million      | 23923   |
| country       | name           | United States   | 4834    |
| country       | population     | 290 million     | 4834    |
What this means:
The page 23923 ([[Germany]]) includes the text {{country|name=Germany|population=82 million}}
The page 4834 ([[United States]]) includes the text {{country|name=United States|population=290 million}}
Upon saving a page, all template calls would be parsed, and the METADATA table would be updated accordingly.
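The save-time update could look roughly like the following Python sketch, which uses an in-memory SQLite database as a stand-in for the real one. The regular expression, the function names and the delete-then-insert strategy are assumptions; the sketch ignores nested templates and piped links such as [[a|b]], which a real parser would have to handle.

```python
import re
import sqlite3

# Naive matcher for {{...}} calls; does not handle nesting (an assumption of this sketch).
TEMPLATE_CALL = re.compile(r"\{\{([^{}]*)\}\}")

def extract_metadata(wikitext):
    """Return (template_name, parameter_name, parameter_value) for each named parameter."""
    rows = []
    for match in TEMPLATE_CALL.finditer(wikitext):
        parts = match.group(1).split("|")
        template_name = parts[0].strip()
        for part in parts[1:]:
            if "=" not in part:
                continue  # unnamed parameters are skipped in this sketch
            name, value = part.split("=", 1)
            rows.append((template_name, name.strip(), value.strip()))
    return rows

def update_metadata(conn, page_id, wikitext):
    """Replace all METADATA rows of a page with those parsed from its new text."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM metadata WHERE page_id = ?", (page_id,))
        conn.executemany(
            "INSERT INTO metadata (template_name, parameter_name, parameter_value, page_id)"
            " VALUES (?, ?, ?, ?)",
            [(t, n, v, page_id) for t, n, v in extract_metadata(wikitext)],
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata (template_name TEXT, parameter_name TEXT,"
    " parameter_value TEXT, page_id INTEGER)"
)
update_metadata(conn, 23923, "{{country|name=Germany|population=82 million}}")
```

Deleting a page's old rows before inserting the new ones keeps the table consistent when a template call is edited or removed.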
Lastly, we would need a page Special:MetadataSearch, which would include a search form like this:
-----------------------------------------
Template  : [country.......^]

Parameter : [ ] Value : [equals ^] [ ] AND (*) OR ( ) AND NOT ( )
Parameter : [ ] Value : [equals ^] [ ] AND ( ) OR ( ) AND NOT ( )
etc.

ORDER BY  : [numerical value ^] SORT ORDER: [ascending ^]
-----------------------------------------
It would of course run the appropriate query on the METADATA table. In most cases I believe this would be fairly fast as most templates would only appear on hundreds of pages, limiting the search space dramatically.
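As a rough idea of what such a query might look like, the Python sketch below builds one SQL statement from a list of form rows. It only covers the "equals" operator and the AND / AND NOT connectors; the function name `search_pages`, the subquery approach and the sample rows (including page id 9001) are assumptions made for illustration.

```python
import sqlite3

def search_pages(conn, template, conditions):
    """Return sorted page_ids using `template` that satisfy all conditions.

    conditions is a list of (connector, parameter, value) tuples, where
    connector is "AND" or "AND NOT" (OR is left out of this sketch).
    """
    sql = "SELECT DISTINCT page_id FROM metadata WHERE template_name = ?"
    args = [template]
    for connector, parameter, value in conditions:
        if connector not in ("AND", "AND NOT"):
            raise ValueError("unsupported connector: " + connector)
        membership = "IN" if connector == "AND" else "NOT IN"
        sql += (
            " AND page_id " + membership + " (SELECT page_id FROM metadata"
            " WHERE template_name = ? AND parameter_name = ? AND parameter_value = ?)"
        )
        args += [template, parameter, value]
    sql += " ORDER BY page_id"
    return [row[0] for row in conn.execute(sql, args)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata (template_name TEXT, parameter_name TEXT,"
    " parameter_value TEXT, page_id INTEGER)"
)
# Made-up sample rows; page 9001 is a hypothetical page id.
conn.executemany("INSERT INTO metadata VALUES ('country', ?, ?, ?)", [
    ("currency", "US dollar", 4834), ("language", "English", 4834),
    ("currency", "US dollar", 9001), ("language", "Spanish", 9001),
    ("currency", "Euro", 23923), ("language", "German", 23923),
])
# Countries that use the US dollar but do not list English as a language.
print(search_pages(conn, "country", [
    ("AND", "currency", "US dollar"),
    ("AND NOT", "language", "English"),
]))
```

Because every subquery is restricted to one template name, the search space stays limited to the pages that actually use that template.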
Now we could do things like:
* Show me all countries with a population larger than X
* Show me the country with the top-level domain .FR
* Show me all countries that use the US dollar as currency, but do not speak English as an official language
etc., of course generically for all kinds of templates.
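A query such as "population larger than X", or ordering by numerical value, needs values like "82 million" to be read as numbers. The Python sketch below shows one possible way; the helper name `numeric_value` and the list of recognised multiplier words are assumptions, and nothing in the proposal fixes how values would be normalised.

```python
import re

# Multiplier words this sketch understands (an assumption, not a complete list).
MULTIPLIERS = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}

NUMBER = re.compile(r"\s*([0-9][0-9,]*(?:\.[0-9]+)?)\s*([A-Za-z]*)")

def numeric_value(text):
    """Return the leading number in text as a float, or None if there is none."""
    match = NUMBER.match(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    # Words that are not known multipliers (units, for example) are ignored.
    return number * MULTIPLIERS.get(match.group(2).lower(), 1)

populations = {"Germany": "82 million", "United States": "290 million"}
threshold = 100_000_000
print([name for name, value in populations.items() if numeric_value(value) > threshold])
```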
As an extra bonus, it would be neat to be able to transclude the results of these queries dynamically on a page, so I could have
[[List of countries by population]]
which would be dynamically generated from such a query.
Advantages of this system:
* Metadata is entered naturally, immediately associated with visible output
* Very high flexibility
We can theoretically extend this relatively easily to allow for invisible metadata that is not directly associated with a template:
{{data:person|firstname=Alfred|lastname=Wallace|birthyear=1823}}
Whether we want to do this is another question. I am leaning against it because I feel it violates basic wiki principles -- cause and effect should be visible to the user. But if we establish the metadata and article separation I alluded to in an earlier post we may reconsider this.
This proposal is directly relevant to the image copyright tagging problem, which can now be solved relatively easily:
{{ image | license=[[GNU Free Documentation License]] | source=photograph taken by [[User:Dogmaster3000]] }}
This text could be auto-inserted into the page based on form data during upload (don't forget to update the METADATA table as well). Now we can query for images under a specific license, uploaded by a specific user etc.
There is of course some overlap with the category system here, but I feel it is minimal. On the other hand, I think that in terms of implementation, the strategies needed here are similar to those for keeping the LINKS tables up to date, so maybe a shared implementation makes sense.
Let me know what you think. I'd be especially interested in Tim's opinion as he is the main developer behind the template system.
I feel there is still a lot of potential in templates, and basic programmability should also be considered. Having stuff like optional parameters and loops would open a whole new range of possibilities. For the typical user, things will become easier, not harder, as they now just have to speak one syntax - {{template|foo=bar}} - and can use it to influence data in very complex dynamic layouts.
All best,
Erik
Erik Moeller wrote:
<<lots of excellent stuff>>
Erik:
This stuff is awesome. This is very similar to some work I have been doing for the Kendra project, a metadata project for linking everything to everything else using the Semantic Web, with a Wiki interface for "semantic bootstrapping". Its short-term goal is to provide better sales channels for the indie music industry as a demonstrator.
I agree that every piece of markup should have a visible display, so that there is "closure" from a user-interface point of view. Templates are the obvious way of doing this, and they also provide a natural place to add the metadata that describes the "class" (in an OWL/RDF sense) that the template refers to.
One way of getting around the "invisible" markup problem would be to add a hover-box which would only appear when the user hovered over a piece of text, and to use a very subtle visual hint to show that this was available.
Another is to generate text from the template, so that
{{data:person|firstname=Alfred|surname=Wallace|birthyear=1823}}
would actually have the same display effect as
[[Alfred Wallace]], (born [[1823]])
To do this right, some forms of "smart" template should be written as PHP code, not as Wiki templates. The person template is a really good example: it could handle special cases such as display of names for cultures which put the personal and family names in the order "FAMILY personal", and also help sort personal names correctly by surname, rather than by ASCII collating order. However, 99% of templates can be done right just with macro generation.
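A "smart" person template of this kind might be sketched as follows. The sketch is in Python rather than PHP purely for illustration, and the function names and the `family_name_first` flag are assumptions, not part of any existing template code.

```python
def display_person(firstname, surname, birthyear, family_name_first=False):
    """Render the wikitext a person template could expand to."""
    if family_name_first:
        # For cultures that put the family name before the personal name.
        name = surname + " " + firstname
    else:
        name = firstname + " " + surname
    return "[[" + name + "]], (born [[" + str(birthyear) + "]])"

def sort_key(firstname, surname):
    """Sort by surname first, then by personal name, ignoring case."""
    return (surname.casefold(), firstname.casefold())

print(display_person("Alfred", "Wallace", 1823))

people = [("Alfred", "Wallace"), ("Charles", "Darwin"), ("Ada", "Lovelace")]
print(sorted(people, key=lambda person: sort_key(*person)))
```

Sorting on a (surname, personal name) tuple is what avoids ordering people by the first letter of their displayed name.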
-- Neil
Erik Moeller wrote:
As most of you know, the MediaWiki namespace is quickly becoming a generic template system with parameters. Currently it only supports unnamed parameters, e.g. {{country|Germany|82 million}}, but it is my understanding that named parameters, e.g. {{country|name=Germany|population=82 million}}, are firmly planned.
All your ideas are pretty impressive, and I firmly support your implementation plan.
However, I would like to call for some thought on the syntax. I think using the "=" and "|" characters alone to separate things might be problematic, because both of these characters should be allowed to occur within a variable's value, too.
Admittedly, the "=" isn't too much of a problem, because if you have something like {{stuff|thing=E=mc²}}, it is easy to say that the first "=" character separates the variable name from the value, and all other "="s are part of the value. However, with the "|", that's not quite so easy. Maybe, analogous to {{ ... }}, we should have "||"? (Yes, I know, table syntax uses it too, and maybe we sometimes want a table inside a variable's value, but since tables are clearly surrounded by {| ... |}, the parser shouldn't have any problems knowing which "||" is which.)
For readability, spaces around the "=" should be allowed.
Thus, my suggested syntax would be:
{{image || license = [[GNU Free Documentation License|GNU FDL]] || source = photograph taken by [[User:Dogmaster3000|]] }}
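The rules suggested here (split on "||", treat only the first "=" as the separator, allow spaces around it) can be sketched in a few lines of Python. The function name `parse_call` is an assumption, and the sketch does not deal with tables inside a value, which would need the {| ... |} handling mentioned above.

```python
def parse_call(call):
    """Parse a {{name || key = value || ...}} call into (name, params)."""
    text = call.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call")
    parts = text[2:-2].split("||")
    name = parts[0].strip()
    params = {}
    for part in parts[1:]:
        # Only the first "=" separates key and value; later ones belong to the value.
        key, separator, value = part.partition("=")
        if not separator:
            raise ValueError("parameter without '=': " + part.strip())
        params[key.strip()] = value.strip()
    return name, params

name, params = parse_call(
    "{{image || license = [[GNU Free Documentation License|GNU FDL]]"
    " || source = photograph taken by [[User:Dogmaster3000|]] }}"
)
print(name, params)
```

Single pipes inside links survive untouched because only the double pipe is treated as a separator.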
Rock on!
Timwi
Timwi-
However, I would like to call for some thought on the syntax.
Agreed. The syntax needs to be unambiguous. Two pipes would probably do the trick, although the similarity to the table syntax could cause real confusion, especially when the output is a table.
Maybe a smarter syntax could be used that uses newlines intelligently:
{{ country
name => Germany
population => 82 million

''(as of 2004)''
}}
Here a line that begins with a valid variable name followed by a => is interpreted as the key, and everything that follows as the value.
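A small Python sketch of this line-based rule follows. The definition of a "valid variable name" (letters, digits and underscores) and the function name `parse_multiline` are assumptions; the sketch treats the first line as the template name and appends every non-key line to the value of the most recent key.

```python
import re

# A key line: a variable name followed by "=>" (the name pattern is an assumption).
KEY_LINE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=>\s?(.*)$")

def parse_multiline(call):
    """Parse the newline-based notation into (template_name, params)."""
    text = call.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call")
    lines = text[2:-2].split("\n")
    template_name = lines[0].strip()
    collected = {}
    current = None
    for line in lines[1:]:
        match = KEY_LINE.match(line)
        if match:
            current = match.group(1)
            collected[current] = [match.group(2)]
        elif current is not None:
            # Continuation line: part of the value of the most recent key.
            collected[current].append(line)
    params = {key: "\n".join(value).strip() for key, value in collected.items()}
    return template_name, params

example = "{{ country\nname => Germany\npopulation => 82 million\n\n''(as of 2004)''\n}}"
print(parse_multiline(example))
```

Only newlines act as separators here, not leading tabs or spaces, so indentation within a value does not change how the call is parsed.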
The double-pipes could be used for a single-line notation, e.g.
{{ country || name => Germany || population => 82 million }}
Regards,
Erik
Using white space as a parsing token seems like a bad idea to me. Isn't that why people hate "make" because of the requirement for the leading tab?
-Kelly
At 12:39 PM 3/20/2004, you wrote:
Maybe a smarter syntax could be used that uses newlines intelligently:
{{ country
name => Germany
population => 82 million

''(as of 2004)''
}}
On Mar 21, 2004, at 05:50, Kelly Anderson wrote:
Using white space as a parsing token seems like a bad idea to me. Isn't that why people hate "make" because of the requirement for the leading tab?
People hate "make" because of the leading *tab*, which can't be told apart from spaces at a glance and is automatically turned into spaces by various operations.
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org