On 9/6/07, Platonides Platonides@gmail.com wrote:
Agree. We don't need to have ONE template on ALL images; we can (and should) have a number of templates, as long as they're documented. I.e., we have a page listing all the "valid" templates and describing their arguments. If a bot knows that Information_Louvre->source is equivalent to Information->Author, it can happily work with any of them being present. Just keep it documented (and provide a working parsing implementation).
Another example is the PD book templates. They have everything about the image: "Page X from book Y, by Foo, in Year, in the public domain". Here the source & author values for the template would be hardcoded.
The problem that comes up is that people just constantly invent new templates, often with trivial differences like hard-coded sources, authorship, or licensing information. These are especially bad cases because when it's stuffed into the template it is as though it isn't provided at all.. until someone goes through and special-cases that template. Eventually we'll end up with 10 million images and 1 million templates, one for each source.. just because our uploading tools suck and people are abusing templates to avoid retyping source or licensing info. :-/
It's utterly unacceptable to expect any tools to keep up with that.
Most of the fields in Information are common to virtually every image. Why should someone have to support 40 different ways of reading the same three or four basic pieces of information which are common to all images? Why should the same basic three or four fields have a different presentation, at random, on some images?
It would be better to add lots of optional arguments to Information.. or offer secondary additional information templates which have less uniformity but more flexibility.
"Gregory Maxwell" gmaxwell@gmail.com wrote on Thu, 6 Sep 2007 17:09:18 -0400:
Most of the fields in Information are common to virtually every image. Why should someone have to support 40 different ways of reading the same three or four basic pieces of information which are common to all images? Why should the same basic three or four fields have a different presentation, at random, on some images?
It would be better to add lots of optional arguments to Information.. or offer secondary additional information templates which have less uniformity but more flexibility.
That would surely be a way to get rid of or improve the Louvre template, or to make it subst'able.
Regards,
Flo
Gregory Maxwell wrote:
On 9/6/07, Platonides wrote:
Agree. We don't need to have ONE template on ALL images; we can (and should) have a number of templates, as long as they're documented. I.e., we have a page listing all the "valid" templates and describing their arguments. If a bot knows that Information_Louvre->source is equivalent to Information->Author, it can happily work with any of them being present. Just keep it documented (and provide a working parsing implementation).
Another example is the PD book templates. They have everything about the image: "Page X from book Y, by Foo, in Year, in the public domain". Here the source & author values for the template would be hardcoded.
The problem that comes up is that people just constantly invent new templates, often with trivial differences like hard-coded sources, authorship, or licensing information.
If changes are trivial, they should be merged.
These are especially bad cases because when it's stuffed into the template it is as though it isn't provided at all.. until someone goes through and special-cases that template.
The bots can alert us to that.
Eventually we'll end up with 10 million images and 1 million templates, one for each source.. just because our uploading tools suck and people are abusing templates to avoid retyping source or licensing info. :-/
"You can't use this home-made template, as it's not listed on [[Commons:The_ultimate_information]]. Also, if you had gone to add it there you would have found there are already 3 templates doing the same thing. Evil-bot-which-dislikes-templates is substituting it. Have a nice day."
It's utterly unacceptable to expect any tools to keep up with that.
It's unacceptable to expect *all* tools to keep up. But a working framework could be provided ;)
The way I see it, there are three possible ways for a bot to get meta information about an image from a template:
1. From the wiki text
2. From the rendered HTML
3. From some future to-be-automatically-generated page:template:variable_key:value data set
#1 is hard/impossible to do correctly (though it might work in many cases), as only the MediaWiki parser can parse this stuff correctly (more or less...). #2 is correct (since it was done by the MediaWiki parser), but slow. #3 IMHO is the only long-term solution. I have proposed this several times, on several lists. Last thing I heard, semantic wikipedia will take care of it. As soon as it gets installed on Commons...
You're right about #1. But we don't need a full parser, only a basic one. More or less like braceSubstitution, omitting all formatting (though perhaps not ignoring wikilinks completely). #2 helps with templates that include other templates, but you need to tag the sections with HTML classes (couldn't we have another XML namespace added for this?). It's also slow.
Most of the fields in Information are common to virtually every image. Why should someone have to support 40 different ways of reading the same three or four basic pieces of information which are common to all images? Why should the same basic three or four fields have a different presentation, at random, on some images?
Ideally, that page would be in a meta-language allowing the bots to learn what the template arguments are before starting to parse. In the short term, the "translation" would be manual and hardcoded.
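As a sketch of what the short-term, manual and hardcoded "translation" might look like, assuming a bot already has the template name and its arguments as a dict: everything below is hypothetical. The registry layout, the canonical field names, the PD_book_example template and its values are invented for illustration; only the Information_Louvre source -> author equivalence comes from this thread.

```python
# Hypothetical hand-maintained registry, standing in for the documented
# list of "valid" templates. "rename" maps a template's own argument names
# to canonical field names; "fixed" holds values hardcoded in the template.
REGISTRY = {
    "Information": {
        "rename": {
            "Description": "description",
            "Source": "source",
            "Author": "author",
        },
        "fixed": {},
    },
    "Information_Louvre": {
        # Equivalence mentioned in the thread: its "source" is the author.
        "rename": {"source": "author"},
        "fixed": {},
    },
    "PD_book_example": {  # invented name for a PD-book style template
        "rename": {"page": "description"},
        "fixed": {"source": "Book Y", "author": "Foo"},
    },
}


def normalize(template, args):
    """Map template-specific arguments onto canonical fields.

    Returns None for a template that is not in the registry, so that a
    bot can report it instead of silently guessing.
    """
    entry = REGISTRY.get(template)
    if entry is None:
        return None
    fields = dict(entry["fixed"])  # start from the hardcoded values
    for key, value in args.items():
        canonical = entry["rename"].get(key)
        if canonical is not None:
            fields[canonical] = value
    return fields
```

A tool written against the canonical fields then needs no per-template special cases; an unlisted template shows up as None, which is where a bot could raise the alert.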
The first question that comes to mind is: what are the basic fields needed?