On 12/08/07, Brianna Laugher brianna.laugher@gmail.com wrote:
On 13/08/07, Andrew Gray shimgray@gmail.com wrote:
b) is perhaps more interesting. In the bowels of whatever system you use for tagging, set it up so that one tag, one identifier, can be represented by many different tags. In effect, allow "cat" and "cats", but ensure a search for one displays the other as well. LibraryThing does this, and does it fairly well; their tag lists contain misspellings and foreign terms as well as variant names, which is quite useful. Configuring this and ensuring it doesn't get accidentally snarled up - inadvertently merging two large groups can be confusing - is tricky, but once it's up and running it should require less ongoing maintenance.
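To make the idea concrete, here is a minimal Python sketch of tag combining, assuming a simple in-memory structure. The names (TagAliases, combine, variants, search, files_by_tag) are invented for illustration and are not taken from LibraryThing or MediaWiki.

```python
from collections import defaultdict


class TagAliases:
    """Groups of tags that should be treated as one identifier (hypothetical)."""

    def __init__(self):
        self._group_of = {}                # tag -> group id
        self._members = defaultdict(set)   # group id -> set of tags
        self._next_id = 0

    def _ensure(self, tag):
        # Give an unseen tag a group of its own.
        if tag not in self._group_of:
            gid = self._next_id
            self._next_id += 1
            self._group_of[tag] = gid
            self._members[gid].add(tag)
        return self._group_of[tag]

    def combine(self, a, b):
        """Merge the groups containing tags a and b."""
        ga, gb = self._ensure(a), self._ensure(b)
        if ga == gb:
            return
        # Move the smaller group into the larger one.
        if len(self._members[ga]) < len(self._members[gb]):
            ga, gb = gb, ga
        for tag in self._members.pop(gb):
            self._group_of[tag] = ga
            self._members[ga].add(tag)

    def variants(self, tag):
        """Return every tag combined with this one, including itself."""
        gid = self._group_of.get(tag)
        return {tag} if gid is None else set(self._members[gid])


def search(files_by_tag, aliases, tag):
    """Return files tagged with the given tag or any of its variants."""
    hits = set()
    for variant in aliases.variants(tag):
        hits |= files_by_tag.get(variant, set())
    return hits


aliases = TagAliases()
aliases.combine("cat", "cats")
files_by_tag = {"cat": {"a.jpg"}, "cats": {"b.jpg"}, "dog": {"c.jpg"}}
cat_files = search(files_by_tag, aliases, "cat")
```

Note that combine is where the "accidentally snarled up" risk lives: a single call can merge two large groups, so a real system would presumably want a log and an undo for it.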
ok... this is interesting. I didn't know this. Don't suppose their SW happens to be open source? :)
I have no idea; I suspect not, since it's mostly homebrewed and they are selling it as a subscription service... but then, it might not be desperately much use to us anyway, given it's homebrewed and targeted specifically for running their site.
http://www.librarything.com/blog/2005/12/combining-tags-heresy.php is the original blog post discussing it; he notes Clay Shirky's objections to it.
[on 'film', 'cinema', 'movies'] "Those terms actually encode different things, and the assertion that restricting vocabularies improves signal assumes that there's no signal in the difference itself, and no value in protecting the user from too many matches."
http://www.shirky.com/writings/ontology_overrated.html#mind_reading
This is a real issue, but I think - *think* - it's not one we need to be desperately paranoid about. The valuable semantic ambiguities lie in intangibles, conceptual things, which are generally not what we want to be tagging a photograph with! We're basically looking at things, not ideas, and there's a much more solid mapping of terminology there.
(I really don't get the workings of the MediaWiki category system very well. Can we have 'redirect categories'? That might help)
http://www.librarything.com/tagcombine_log.php gives you an idea of the kind of tag merging going on.
The other problem which cannot be ignored in relation to tagging/categories is that people need to be able to use the equivalent tag in their own language, and have the tags show up in their own language, while still seeing all the files that have been tagged with the equivalent tag in other languages.
Yeah. Though... hmm. The LT model is that each tag is basically a redirect (to use our jargon) for the "main tag", which is determined by raw popularity; it presumably wouldn't be too difficult to have the main tag vary by language if you had some way of categorising the "minor tags" by source language.
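As a rough sketch of that per-language "main tag" idea, assuming each minor tag carries a language code and a usage count (both invented here, as are the names Variant and main_tag and the sample numbers):

```python
from typing import NamedTuple, Optional, Sequence


class Variant(NamedTuple):
    text: str   # the tag as typed
    lang: str   # assumed language label for the tag
    uses: int   # how many times the tag has been applied


def main_tag(variants: Sequence[Variant], lang: Optional[str] = None) -> str:
    """Pick the most-used variant in the reader's language.

    Falls back to the most-used variant overall when no language is
    given or the group has no variant in that language.
    """
    pool = [v for v in variants if v.lang == lang] if lang else []
    if not pool:
        pool = list(variants)
    return max(pool, key=lambda v: v.uses).text


# Made-up counts, purely for illustration.
group = [
    Variant("cat", "en", 120),
    Variant("cats", "en", 80),
    Variant("Katze", "de", 30),
    Variant("chat", "fr", 25),
]
default_label = main_tag(group)
german_label = main_tag(group, "de")
fallback_label = main_tag(group, "ja")
```

The hard part is not this lookup but getting a reliable language label onto each minor tag in the first place.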
The real killer here would *probably* be the multilingual homonym problem - orthographically identical words that mean different things in different languages; they'd get combined happily because as far as a monolingual speaker knew, they were unambiguous, and have... interesting... knock-on effects.
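A small sketch of how that could go wrong, using "gift" (a present in English, poison in German) as the example. It compares combining on the bare string with combining on a (language, text) pair; the pair key is only one possible mitigation, assumed here for illustration, and the helper names are invented.

```python
def find(parent, x):
    """Return the representative of x's group (union-find with path halving)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Combine the groups containing a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def same_group(parent, a, b):
    return find(parent, a) == find(parent, b)


# Keyed on the bare string: two reasonable merges collide on "gift".
plain = {}
union(plain, "gift", "present")   # merged by someone thinking in English
union(plain, "gift", "poison")    # merged by someone thinking in German
plain_collides = same_group(plain, "present", "poison")

# Keyed on (language, text): the two "gift" tags stay distinct.
qualified = {}
union(qualified, ("en", "gift"), ("en", "present"))
union(qualified, ("de", "gift"), ("en", "poison"))
qualified_collides = same_group(qualified, ("en", "present"), ("en", "poison"))
```

With bare strings, a search for "present" would start returning everything tagged "poison"; with language-qualified keys it would not, at the cost of needing to know each tag's language.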