Roger Luethi wrote:
On Sun, 04 Jun 2006 11:31:02 -0400, Anthony DiPierro wrote:
Once software exists in MediaWiki to compute intersections of categories, it would be relatively simple to recategorize articles into their parent categories without losing any information. Every article in a subcategory that has multiple parent attribute categories would be moved automatically into those parent categories, and this would be repeated until no such subcategories remained. The ad-hoc structure could still be kept, but it could be calculated on the fly (along with new types of intersections, which could easily be added).
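A rough sketch of the recategorization loop described above, in Python rather than in MediaWiki's own code. The function names, the data layout (plain dicts of sets), and the sample articles other than "Pont du Gard" are made up for illustration, and it assumes the parent graph has no cycles.

```python
def flatten(article_cats, parents):
    """Move articles out of intersection categories into the parent attributes.

    article_cats: dict mapping article -> set of category names
    parents:      dict mapping category -> set of parent attribute categories
    Assumes the parent graph has no cycles; otherwise the loop may not end.
    """
    result = {article: set(cats) for article, cats in article_cats.items()}
    changed = True
    while changed:  # repeat until no intersection category is left
        changed = False
        for cats in result.values():
            for cat in list(cats):
                attribute_parents = parents.get(cat, set())
                if len(attribute_parents) >= 2:
                    # Replace the subcategory with its parent attributes.
                    cats.remove(cat)
                    cats.update(attribute_parents)
                    changed = True
    return result


def intersect(article_cats, *wanted):
    """Compute an intersection on the fly: articles in every wanted category."""
    return sorted(a for a, cats in article_cats.items() if set(wanted) <= cats)


# Made-up sample data; only "Pont du Gard" and "Bridges in France"
# come from the discussion.
parents = {
    "Bridges in France": {"Bridges", "France"},
    "Roman bridges in France": {"Bridges in France", "Roman architecture"},
}
articles = {
    "Pont du Gard": {"Roman bridges in France"},
    "Millau Viaduct": {"Bridges in France"},
    "Eiffel Tower": {"France"},
}

flat = flatten(articles, parents)
bridges_in_france = intersect(flat, "Bridges", "France")
```

Here "Bridges in France" is no longer stored anywhere; it is recomputed as the intersection of "Bridges" and "France" whenever someone asks for it.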
So instead of being in the category "Bridges in France", the "Pont du Gard" would now be in the categories "Bridges" and "France"!?
There's no benefit from this unless we have a good search system.
(Now that I try this on an example, I see that this algorithm would probably have to be tweaked to deal with subcategories of [[Category:Categories by topic]], but that's not too bad.)
I suspect there are more corner cases than we imagine.
Absolutely. See http://en.wiktionary.org/wiki/Wiktionary:Semantic_relations , and this could be taken even further.
For instance, how do you connect the districts of Paris to the category Paris? What is a subset of the parent attribute "Paris": "Districts of Paris", or "Quartier Latin", or neither? Does it bother you if the article on a French district is now in a subcategory of "Capitals in Europe"?
[[Category:Paris]] is a theme, not an attribute, so [[Category:Paris]] should not be a subcategory of [[Category:Capitals in Europe]].
One problem I'm seeing here is that your proposal focuses on a single type of hierarchy. Paris, France, and Europe are now all themes. So if I ask for the intersection of the categories "Bridges" and "France", the results include bridges that are related to the "France" theme for any reason. There is no way to ask specifically for bridges in France, or in Paris.
One solution would be to create "is part of" or "is in" categories in addition to the "is a" categories. Then you can have a hierarchy with "Pont du Gard" -> "France" -> "Europe", or "cylinder" -> "engine" -> "car".
We'd have two independent types of categories, plus a third, the catch-all "theme".
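To make the idea of separate relation types concrete, here is a minimal Python sketch. The class name, the relation labels "is_a", "is_in" and "theme", and the "History of French bridge building" article are all hypothetical; nothing like this exists in MediaWiki as far as this thread says.

```python
from collections import defaultdict


class TypedCategories:
    """Category links that each carry a relation type (hypothetical design)."""

    def __init__(self):
        # relation -> child -> set of direct parents
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, child, relation, parent):
        self.edges[relation][child].add(parent)

    def ancestors(self, node, relation):
        """Everything reachable from node by following only one relation type."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self.edges[relation].get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def find(self, articles, is_a, is_in):
        """Articles that are an `is_a` AND are located in / part of `is_in`."""
        return sorted(
            a for a in articles
            if is_a in self.ancestors(a, "is_a")
            and is_in in self.ancestors(a, "is_in")
        )


cats = TypedCategories()
cats.add("Pont du Gard", "is_a", "Bridges")
cats.add("Pont du Gard", "is_in", "France")
cats.add("France", "is_in", "Europe")
cats.add("cylinder", "is_in", "engine")
cats.add("engine", "is_in", "car")
# Related to both topics, but neither a bridge nor located in France.
cats.add("History of French bridge building", "theme", "Bridges")
cats.add("History of French bridge building", "theme", "France")

articles = ["Pont du Gard", "History of French bridge building"]
in_france = cats.find(articles, is_a="Bridges", is_in="France")
in_europe = cats.find(articles, is_a="Bridges", is_in="Europe")
```

Because each query follows only one relation type at a time, the theme-only article is left out of both results, while "Pont du Gard" is found in Europe by way of France.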
That still doesn't disambiguate "cylinder" -> "geometric solid", or "engine" -> "aircraft", or "engine" -> "railway".
Ec