On 6/6/06, Roger Luethi collector@hellgate.ch wrote:
You ask some good questions here!
Right. But more importantly, the way categories are used is thoroughly entrenched and I suspect your chances of changing it are close to zero even assuming you could make it a policy.
It depends whether it can be done in an evolutionary rather than revolutionary way. Slowly migrating hardcoded "Polish scientist" categories to some more advanced method should work - once critical mass is achieved (assuming that people can be convinced that it's a better method), it could become the dominant method. Suddenly wading in and reorganising category hierarchies is probably doomed, otoh.
It would be easier to create clean trees in a parallel namespace. Say, leave [[Category:Bridges]] alone and create [[is a:Bridge]] (or, without software changes, [[Category:is a Bridge]]).
Or even {{isa|bridge}} ? Can anything useful be done with templates' "what links here"?
That said, it seems that you are overestimating the importance of one type of relationships at the expense of others.
I think I agree. Let me try something: say we make a shallow "taxonomy" tree (or not even tree?) and allow attributes instead to be hierarchical:
Paris "isa city" +in France
("isa city" is the taxonomy, +in France is the attribute.)
Now, +in France can be a subattribute of +in Europe (and it could have been made +in Ile de France or whatever).

Britney Spears "isa person" +female +singer +alive
+singer could be a subattribute of +entertainer.

Pont du Gard "isa aqueduct" +in France, +Roman-built
"isa aqueduct" can be a subcategory of "isa bridge" and "isa construction".
This seems to be relatively clean, despite the fact that the attribute hierarchies have different meanings: "in" as opposed to "is a specialisation of".
Basically what I'm proposing now is keeping taxonomies quite strict, and allowing greater flexibility in attributes. So we'll always know whether an item is a soccer team or a city, but we may lose information on the finer details if the attributes aren't managed carefully. Still, that's a better situation than the current one, where we can't distinguish between a rock band and a person.
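A rough sketch of that scheme in Python, just to check that it hangs together. All the names here (`ISA_PARENTS`, `ATTR_PARENTS`, `ITEMS`, `expand`) are made up for illustration; nothing like this exists in the software. The idea is one strict "isa" per item, plus a set of attributes whose hierarchies are kept separately.

```python
# Hypothetical data: each taxonomy node or attribute maps to its direct parents.
ISA_PARENTS = {
    "aqueduct": ["bridge", "construction"],
    "city": [],
    "person": [],
}
ATTR_PARENTS = {
    "in France": ["in Europe"],
    "singer": ["entertainer"],
}

# Each item has exactly one "isa" and any number of attributes.
ITEMS = {
    "Paris": ("city", {"in France"}),
    "Britney Spears": ("person", {"female", "singer", "alive"}),
    "Pont du Gard": ("aqueduct", {"in France", "Roman-built"}),
}


def expand(names, parents):
    """Return the given names plus all of their ancestors in `parents`."""
    seen = set()
    stack = list(names)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(parents.get(name, []))
    return seen


def is_a(item, kind):
    """True if the item's taxonomy entry is `kind` or a specialisation of it."""
    isa, _ = ITEMS[item]
    return kind in expand([isa], ISA_PARENTS)


def has_attr(item, attr):
    """True if the item carries `attr` directly or via a subattribute."""
    _, attrs = ITEMS[item]
    return attr in expand(attrs, ATTR_PARENTS)
```

With this, `is_a("Pont du Gard", "bridge")` and `has_attr("Paris", "in Europe")` both hold without anyone tagging those articles by hand.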
More examples to try and break things?
Some hierarchies are perfectly natural and useful but are not "is a" relationships (Europe - France - Paris, Family - Genus - Species).
I don't quite understand your second example. "Rattus rattus" "is a" "Rattus" is ok isn't it?
Many attributes are perfectly natural and useful but they tend not to fit hierarchies well. You started using them as soon as you sketched out the supposedly taxonomic category women. There simply is no natural taxonomic hierarchy for women, just a bunch of attributes.
Yeah, I see that. So, we stop the taxonomy at "person", and instead have hierarchical attributes? Is this actually far from the current situation? Hmm.
Now _if_ you want to draw a natural hierarchy with women in it, try genealogy. But guess what? That's another type of relationship that we can't deal with (X is ancestor of Y, Y is ancestor of Z).
I think overall, having objects in a hierarchy is not the goal in itself - the goal is organising information, being able to group related information, and being able to make meaningful statements such as "43% of our articles are about people".
Agreed. But the long-term goal should be for "Made in US" to be dynamically generated. It's just a bunch of relationships ("made in", "died in") and a list of attributes -- hierarchical even, in this case ("New York", "US", "North America"). You can have all kinds of fun with that, until someone adds a relationship like "was named after" and your software concludes that people named after London are also named after Great Britain :-).
Heh, yeah attention has to be paid to what meaning can be extrapolated from a supercategory/superattribute relationship.
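One way to pay that attention, sketched in Python: record for each relationship type whether it may be carried up the place hierarchy. The `PROPAGATES` table and the function name are assumptions for illustration only.

```python
# Hypothetical place hierarchy: each place maps to the place containing it.
PLACE_PARENT = {
    "New York": "US",
    "US": "North America",
    "London": "Great Britain",
}

# Hypothetical flag per relationship: does "X <relation> place" also hold
# for every place that contains it?
PROPAGATES = {
    "made in": True,
    "died in": True,
    "was named after": False,
}


def implied_places(relation, place):
    """List the places for which `relation` holds, given it holds for `place`."""
    places = [place]
    # Unknown relationships are treated as non-propagating, to be safe.
    if PROPAGATES.get(relation, False):
        while places[-1] in PLACE_PARENT:
            places.append(PLACE_PARENT[places[-1]])
    return places
```

So "made in New York" would also put an article in a generated "Made in US" list, while "was named after London" stays with London.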
But you have to deal with them anyway. You suggested something like this:
women
*real women
*-living women
*-dead women
*fictional women
You _are_ using attributes here. So what if I'm looking for the biography of a female Polish chemist but don't know whether that woman is still alive? Do I have to check both categories, or do we maintain trees for every possible order of attributes (which is pretty much what we are doing right now, manually)?
Yeah, that doesn't work well. Better to use semantic attributes, possibly with antonym relationships built in (not sure of the immediate use, but it's probably helpful to distinguish between living/not living/unknown). So, to look for your female Polish chemist, you simply look for person (or possibly, chemist), +female +Polish.
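Roughly, in Python (the article names are placeholders and the `search` and `living_status` helpers are hypothetical, not anything the software offers):

```python
# Placeholder articles: (taxonomy entry, set of attributes).
ARTICLES = {
    "Article A": ("person", {"female", "Polish", "chemist", "alive"}),
    "Article B": ("person", {"female", "Polish", "chemist", "dead"}),
    "Article C": ("person", {"female", "Polish", "chemist"}),  # status not recorded
    "Article D": ("person", {"male", "Polish", "chemist", "alive"}),
}


def search(isa, *wanted):
    """Return articles of the given kind that carry every wanted attribute."""
    return sorted(
        name
        for name, (kind, attrs) in ARTICLES.items()
        if kind == isa and set(wanted) <= attrs
    )


def living_status(name):
    """Three-way answer: 'alive', 'dead', or 'unknown' if neither is recorded."""
    _, attrs = ARTICLES[name]
    if "alive" in attrs:
        return "alive"
    if "dead" in attrs:
        return "dead"
    return "unknown"
```

The search never mentions alive or dead, so there is only one place to look, whatever order the attributes were added in.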
Where's {{Category:Polish chemists}} coming from? Defined on a separate page?
Defined on a separate page by someone who thought it was a meaningful and useful category, and worth spending 2 minutes making.
And do we also add {{Category:Female chemists}} and {{Category:Polish physicists}} and {{Category:Polish women}} to the same article?
You could, and the software (small matter of programming) would be smart enough to take the superset of all these things:
Person +chemist +Polish
Person +female +chemist
Person +physicist +Polish
Person +female +Polish
Net result: Person +chemist +physicist +female +Polish
Alternatively if you knew the attributes directly you could just do {{Polish chemists}} +female +physicist
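The "small matter of programming" could look something like this in Python. The `CATEGORY_DEFS` table stands in for the separate definition pages and is purely hypothetical, as is the `merge` function.

```python
# Hypothetical definitions, as they might be written on each category's own page:
# (taxonomy entry, attributes the category implies).
CATEGORY_DEFS = {
    "Polish chemists": ("Person", {"chemist", "Polish"}),
    "Female chemists": ("Person", {"female", "chemist"}),
    "Polish physicists": ("Person", {"physicist", "Polish"}),
    "Polish women": ("Person", {"female", "Polish"}),
}


def merge(category_names, extra_attrs=()):
    """Combine category templates and directly stated attributes into one entry."""
    kinds = set()
    attrs = set(extra_attrs)
    for name in category_names:
        kind, cat_attrs = CATEGORY_DEFS[name]
        kinds.add(kind)
        attrs |= cat_attrs  # set union, so duplicates collapse
    if len(kinds) != 1:
        # An article should end up with exactly one taxonomy entry.
        raise ValueError(f"expected one taxonomy entry, got {sorted(kinds)}")
    return kinds.pop(), attrs
```

Both routes give the same result: merging all four categories, or `merge(["Polish chemists"], ["female", "physicist"])`, yields Person with chemist, physicist, female and Polish.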
Steve