On 6/4/06, Roger Luethi <collector(a)hellgate.ch> wrote:
On Sun, 04 Jun 2006 11:31:02 -0400, Anthony DiPierro wrote:
Once support for computing intersections of categories is written into
the MediaWiki software, it would be relatively simple to
recategorize the articles into their parent categories, such that no
information was lost. The way this would be done is that all articles
in a subcategory which had multiple parent attribute categories would
be automatically moved into the parent categories. This would be
repeated until no such situations continued to exist. The ad-hoc
structure could still be kept, but it could be calculated on the fly
(along with new types of intersections which could be easily added).
So instead of being in the category "Bridges in France", the "Pont du Gard"
would now be in the categories "Bridges" and "France"!?
No, it'd be in "Bridges" and "Buildings and structures in France".
(Now that I do this on an example, I see that this algorithm would
probably have to be tweaked to deal with subcategories of
[[Category:Categories by topic]], but that's not too bad.)
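The recategorization pass described above can be sketched roughly as follows. This is only an illustration, not MediaWiki code: the function name `flatten_intersections`, the two dictionaries, and the sample parent links are all hypothetical, and the sketch assumes the attribute hierarchy has no cycles (otherwise the loop might never finish).

```python
def flatten_intersections(members, parents):
    """Move articles out of every category that has two or more parent
    attribute categories and into those parents, repeating until no such
    category still holds articles.

    members: category name -> set of article titles (hypothetical layout)
    parents: category name -> set of parent attribute categories
    Assumes the parent links are acyclic.
    """
    result = {cat: set(arts) for cat, arts in members.items()}
    changed = True
    while changed:
        changed = False
        for cat, cat_parents in parents.items():
            articles = result.get(cat, set())
            if len(cat_parents) < 2 or not articles:
                continue
            for parent in cat_parents:
                result.setdefault(parent, set()).update(articles)
            result[cat] = set()  # the ad-hoc category is now empty
            changed = True
    return result


# Made-up sample data mirroring the example in this thread.
members = {"Bridges in France": {"Pont du Gard"}}
parents = {
    "Bridges in France": {"Bridges", "Buildings and structures in France"},
    "Buildings and structures in France": {"Buildings and structures by country"},
}

flat = flatten_intersections(members, parents)

# The old ad-hoc category can then be recomputed on the fly as an intersection.
bridges_in_france = flat["Bridges"] & flat["Buildings and structures in France"]
```

With this sample data, "Pont du Gard" ends up in "Bridges" and "Buildings and structures in France", and intersecting the two recovers the old contents of "Bridges in France".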
I suspect there are more corner cases than we imagine.
I'm not sure, but I have downloaded the latest db to start playing with this.
For instance, how do you connect the districts of Paris to the category
Paris? What is a subset of the parent attribute "Paris": "Districts of
Paris", or "Quartier Latin", or neither? Does it bother you if the article
on a French district is now in a subcategory of "Capitals in Europe"?
[[Category:Paris]] is a theme, not an attribute, so [[Category:Paris]]
should not be a subcategory of [[Category:Capitals in Europe]].
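A rule like "a theme should not be a subcategory of an attribute" could be checked against a database dump with something like the sketch below. The `kind` labels are purely hypothetical (nothing in MediaWiki marks a category as a theme or an attribute, so they would have to be assigned by hand), and the function name and sample data are made up for illustration.

```python
def themes_under_attributes(kind, parents):
    """Return (child, parent) pairs where a category labelled "theme"
    is a direct subcategory of one labelled "attribute".

    kind:    category name -> "theme" or "attribute" (hand-assigned labels)
    parents: category name -> set of parent categories
    Categories without a label are ignored.
    """
    return sorted(
        (child, parent)
        for child, child_parents in parents.items()
        for parent in child_parents
        if kind.get(child) == "theme" and kind.get(parent) == "attribute"
    )


# Made-up sample data based on the Paris example.
kind = {
    "Paris": "theme",
    "Capitals in Europe": "attribute",
    "Bridges": "attribute",
}
parents = {
    "Paris": {"Capitals in Europe", "France"},
    "Bridges in Paris": {"Bridges", "Paris"},
}

violations = themes_under_attributes(kind, parents)
```

Here the only pair flagged is ("Paris", "Capitals in Europe"); "France" and "Bridges in Paris" carry no label in this sample and are skipped.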
One problem I'm seeing here is that your proposal focuses on a single
type of hierarchy. Paris, France, Europe are now all themes. So if I
ask for an intersection of categories "Bridges" and "France", the results
include bridges that for any reason are related to the "France" theme.
There is no way to ask for Bridges in France, or Paris.
We could always make "Buildings and structures in France" a
subcategory of "Stuff in France". But you're right, it's not quite as
clean as I had expected.
I wonder though if this is just another aspect of the
[[Category:Categories by topic]] issue.
One solution would be to create "is part of" or "is in" categories in
addition to the "is a" categories. Then you can have a hierarchy with
"Pont du Gard" -> "France" -> "Europe", or "cylinder" -> "engine" -> "car".
We'd have two independent types of categories, plus a third, the catch-all
"theme".
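To make the idea of separate "is a" and "is in" hierarchies concrete, here is a minimal sketch in which every link carries a relation type and a query follows only one type at a time. The class `TypedCategoryGraph`, the relation names `is_a` and `is_in`, and the sample links are assumptions made for illustration, not an existing MediaWiki feature.

```python
from collections import defaultdict


class TypedCategoryGraph:
    """Category links tagged with a relation type (hypothetical design)."""

    def __init__(self):
        # (relation, child) -> set of direct parents under that relation
        self._edges = defaultdict(set)

    def add(self, child, relation, parent):
        self._edges[(relation, child)].add(parent)

    def ancestors(self, node, relation):
        """All nodes reachable from `node` by following only `relation`."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self._edges.get((relation, current), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


graph = TypedCategoryGraph()
graph.add("Pont du Gard", "is_a", "Bridges")
graph.add("Pont du Gard", "is_in", "France")
graph.add("France", "is_in", "Europe")
graph.add("cylinder", "is_in", "engine")
graph.add("engine", "is_in", "car")

located_in = graph.ancestors("Pont du Gard", "is_in")
kinds = graph.ancestors("Pont du Gard", "is_a")

# "Bridges in France" becomes a query across both hierarchies.
is_bridge_in_france = "Bridges" in kinds and "France" in located_in
```

Because the two relations are traversed separately, asking for things that are bridges and are located in France does not pull in articles that are merely related to the "France" theme.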
female human beings and your example [[Category:Feminine hygiene]]? How
about [[Category:Women's rights]]? Add an umbrella cat "Somehow related to
women" maybe?
[[Category:Women]] could be a subcategory of [[Category:Woman]].
Heh. Looking at [[Category:Women]], it's now a subcat of both
[[Category:People]] and [[Category:Humans]].
The latter should probably be [[Category:Humanity]] :).
Here's another fun fact. [[Category:Humans]] is a subcat of
[[Category:People]]. Ick.
And now that I look at it, [[Category:Humans]] is also a subcat of
[[Category:Apes]] (presumably a theme). 'Course that wouldn't have
interested me if I was acquainted with the terminology that "humans"
are "apes" :).
OK, this plural/singular thing probably isn't going to work.
mistakes all over again. The advantage of my proposal to not allow
themes as subcategories of attributes is that it can be implemented
today, without much disruption, and without modifying any code.
What I missed in your proposal is how you retain existing theme
information. Duplicate categories in a theme namespace?
Roger
Yeah. So if [[Category:Humans]] was a theme and an attribute, we'd
split it into [[Category:Humans]] (the attribute) and
[[Category:Humanity]] (the theme), or something.
I suppose the plural/singular thing is still useful as a rough
guideline. If things are kept separated it doesn't have to be
followed strictly.
Anyway, I'm starting to think things are too far gone by now. Might
as well just wait for the features first and then start reorganizing
things. But I've got a little bit of hope, and my import script of
the downloaded data is still running.
Anthony