On 6/4/06, Roger Luethi collector@hellgate.ch wrote:
On Sun, 04 Jun 2006 11:31:02 -0400, Anthony DiPierro wrote:
Once MediaWiki can compute intersections of categories, it would be relatively simple to recategorize the articles into their parent categories, such that no information is lost. Every article in a subcategory that has multiple parent attribute categories would be moved automatically into those parent categories, and this would be repeated until no such cases remained. The ad-hoc structure could still be kept, but it could be calculated on the fly (along with new types of intersections, which could easily be added).
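A rough sketch of that loop, to make the idea concrete. This is only an illustration, not anything in MediaWiki: the function name, the two dictionaries, and the assumption that the category graph has no cycles are all made up for the example.

```python
def flatten_intersections(parents, members):
    """Move articles out of categories that have two or more parents.

    parents: hypothetical mapping, category -> set of parent categories
    members: hypothetical mapping, article -> set of categories it is in
    Assumes the category graph is acyclic; otherwise the loop may not end.
    """
    # Work on a copy so the caller's data is left alone.
    members = {article: set(cats) for article, cats in members.items()}
    changed = True
    while changed:  # repeat until no such situation is left
        changed = False
        for cats in members.values():
            for cat in list(cats):
                ups = parents.get(cat, set())
                if len(ups) >= 2:
                    # Replace the intersection category by its parents.
                    cats.discard(cat)
                    cats.update(ups)
                    changed = True
    return members


# Toy data only; the real parent lists on Wikipedia are longer.
parents = {
    "Bridges in France": {"Bridges", "Buildings and structures in France"},
    "Buildings and structures in France": {"France"},
}
members = {"Pont du Gard": {"Bridges in France"}}
print(flatten_intersections(parents, members))
```

With this toy data the article stops at "Buildings and structures in France", because that category has only one parent here.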
So instead of being in the category "Bridges in France", the "Pont du Gard" would now be in the categories "Bridges" and "France"!?
No, it'd be in "Bridges" and "Buildings and structures in France".
(Now that I do this on an example, I see that this algorithm would probably have to be tweaked to deal with subcategories of [[Category:Categories by topic]], but that's not too bad.)
I suspect there are more corner cases than we imagine.
I'm not sure, but I have downloaded the latest db to start playing with this.
For instance, how do you connect the districts of Paris to the category Paris? What is a subset of the parent attribute "Paris": "Districts of Paris", or "Quartier Latin", or neither? Does it bother you if the article on a French district is now in a subcategory of "Capitals in Europe"?
[[Category:Paris]] is a theme, not an attribute, so [[Category:Paris]] should not be a subcategory of [[Category:Capitals in Europe]].
One problem I'm seeing here is that your proposal focuses on a single type of hierarchy. Paris, France, Europe are now all themes. So if I ask for an intersection of the categories "Bridges" and "France", the results include bridges that are related to the "France" theme for any reason at all. There is no way to ask for bridges in France, or in Paris.
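To illustrate the concern, here is a small sketch of an on-the-fly intersection over theme categories. The data is invented for the example (including the placeholder article), and the function names are hypothetical.

```python
def articles_under(category, children, direct_members):
    """Collect every article in a category or any of its subcategories."""
    seen, stack, found = set(), [category], set()
    while stack:
        cat = stack.pop()
        if cat in seen:  # guard against loops in the category graph
            continue
        seen.add(cat)
        found |= direct_members.get(cat, set())
        stack.extend(children.get(cat, ()))
    return found


def intersect(cat_a, cat_b, children, direct_members):
    """Articles that sit somewhere below both categories."""
    return (articles_under(cat_a, children, direct_members)
            & articles_under(cat_b, children, direct_members))


# Invented toy data: category -> subcategories, category -> articles.
children = {
    "France": ["Buildings and structures in France", "French engineers"],
    "Bridges": ["Bridge engineers"],
}
direct_members = {
    "Bridges": {"Pont du Gard"},
    "Buildings and structures in France": {"Pont du Gard"},
    "French engineers": {"Some French bridge engineer"},
    "Bridge engineers": {"Some French bridge engineer"},
}
print(intersect("Bridges", "France", children, direct_members))
```

The result contains the engineer as well as the bridge, since both are merely "related to" the two themes; nothing in the data says which one is actually a bridge located in France.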
We could always make "Buildings and structures in France" a subcategory of "Stuff in France". But you're right, it's not quite as clean as I had expected.
I wonder though if this is just another aspect of the [[Category:Categories by topic]] issue.
One solution would be to create "is part of" or "is in" categories in addition to the "is a" categories. Then you can have a hierarchy with "Pont du Gard" -> "France" -> "Europe", or "cylinder" -> "engine" -> "car".
We'd have two independent types of categories, plus a third, the catch-all "theme".
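A minimal sketch of how two separate relation types might be queried, assuming each relation is stored as its own set of edges. The names IS_A, IS_IN, ancestors and matches are hypothetical, and the "is a" edges are invented for the example.

```python
def ancestors(node, edges):
    """Everything reachable from node by following one relation type."""
    seen, stack = set(), list(edges.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(edges.get(current, ()))
    return seen


# Two independent hierarchies, kept apart (toy data).
IS_A = {
    "Pont du Gard": ["Bridges"],
    "Bridges": ["Buildings and structures"],
}
IS_IN = {
    "Pont du Gard": ["France"],
    "France": ["Europe"],
    "cylinder": ["engine"],
    "engine": ["car"],
}


def matches(article, kind, place):
    """True if the article 'is a' kind and 'is in' place."""
    return kind in ancestors(article, IS_A) and place in ancestors(article, IS_IN)


print(matches("Pont du Gard", "Bridges", "Europe"))
print(ancestors("cylinder", IS_IN))
```

Because the two relations never mix, asking for bridges in Europe only follows "is in" edges for the location and "is a" edges for the type; a theme category would need a third edge set of its own.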
female human beings and your example [[Category:Feminine hygiene]]? How about [[Category:Women's rights]]? Add an umbrella cat "Somehow related to women" maybe?
[[Category:Women]] could be a subcategory of [[Category:Woman]].
Heh. Looking at [[Category:Women]], it's now a subcat of both [[Category:People]] and [[Category:Humans]].
The latter should probably be [[Category:Humanity]] :).
Here's another fun fact. [[Category:Humans]] is a subcat of [[Category:People]]. Ick.
And now that I look at it, [[Category:Humans]] is also a subcat of [[Category:Apes]] (presumably a theme). 'Course that wouldn't have interested me if I was acquainted with the terminology that "humans" are "apes" :).
OK, this plural/singular thing probably isn't going to work.
mistakes all over again. The advantage of my proposal to not allow themes as subcategories of attributes is that it can be implemented today, without much disruption, and without modifying any code.
What I missed in your proposal is how you retain existing theme information. Duplicate categories in a theme namespace?
Roger
Yeah. So if [[Category:Humans]] were both a theme and an attribute, we'd split it into [[Category:Humans]] (the attribute) and [[Category:Humanity]] (the theme), or something.
I suppose the plural/singular thing is still useful as a rough guideline. If things are kept separated it doesn't have to be followed strictly.
Anyway, I'm starting to think things are too far gone by now. Might as well just wait for the features first and then start reorganizing things. But I've got a little bit of hope, and my import script of the downloaded data is still running.
Anthony