[WikiEN-l] Types of categories

Anthony DiPierro wikilegal at inbox.org
Sun Jun 4 23:17:27 UTC 2006


On 6/4/06, Roger Luethi <collector at hellgate.ch> wrote:
> On Sun, 04 Jun 2006 11:31:02 -0400, Anthony DiPierro wrote:
> > Once the software is written to compute intersections of categories
> > within the Mediawiki software, it would be relatively simple to
> > recategorize the articles into their parent categories, such that no
> > information was lost.  The way this would be done is that all articles
> > in a subcategory which had multiple parent attribute categories would
> > be automatically moved into the parent categories.  This would be
> > repeated until no such situations continued to exist.  The ad-hoc
> > structure could still be kept, but it could be calculated on the fly
> > (along with new types of intersections which could be easily added).
>
> So instead of in the category "Bridges in France", the "Pont du Gard" would
> now be in the categories "Bridges" and "France"!?
>
No, it'd be in "Bridges" and "Buildings and structures in France".
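
To make the flattening step concrete, here is a rough sketch of how I picture it. Everything in it is an assumption for illustration: the function name `flatten`, the two dict shapes, and the toy data (in particular that "Buildings and structures in France" has only one attribute parent). It is not how Mediawiki stores categories.

```python
def flatten(parents, members):
    """Move articles out of every category that has more than one parent.

    parents: category -> set of parent (attribute) categories  [assumed shape]
    members: category -> set of article titles                 [assumed shape]
    Assumes the category graph has no cycles; with a cycle this could loop forever.
    """
    members = {cat: set(arts) for cat, arts in members.items()}
    changed = True
    while changed:  # repeat until no multi-parent category holds articles
        changed = False
        for cat, cat_parents in parents.items():
            if len(cat_parents) > 1 and members.get(cat):
                for parent in cat_parents:
                    members.setdefault(parent, set()).update(members[cat])
                members[cat] = set()
                changed = True
    return members


# Toy data, hypothetical apart from the category and article names in this thread.
parents = {
    "Bridges in France": {"Bridges", "Buildings and structures in France"},
    "Buildings and structures in France": {"Buildings and structures"},
}
members = {"Bridges in France": {"Pont du Gard"}}

flat = flatten(parents, members)
# The old ad-hoc category can be recomputed on the fly as an intersection.
bridges_in_france = flat["Bridges"] & flat["Buildings and structures in France"]
```

The point of the last line is that no information is lost: the emptied "Bridges in France" is recoverable from its two parents.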

> > (Now that I do this on an example, I see that this algorithm would
> > probably have to be tweaked to deal with subcategories of
> > [[Category:Categories by topic]], but that's not too bad.)
>
> I suspect there are more corner cases than we imagine.
>
I'm not sure, but I have downloaded the latest database dump to start
playing with this.

> > > For instance, how do you connect the districts of Paris to the category
> > > Paris? What is a subset of the parent attribute "Paris": "Districts of
> > > Paris", or "Quartier Latin", or neither? Does it bother you if the article
> > > on a French district is now in a subcategory of "Capitals in Europe"?
> > >
> > [[Category:Paris]] is a theme, not an attribute, so [[Category:Paris]]
> > should not be a subcategory of [[Category:Capitals in Europe]].
>
> One problem I'm seeing here is that your proposal focuses on a single
> type of hierarchy. Paris, France, Europe are now all themes. So if I
> ask for an intersection of categories "Bridges" and "France", the results
> include bridges that for any reason are related to the "France" theme.
> There is no way to ask for Bridges in France, or Paris.
>
We could always make "Buildings and structures in France" a
subcategory of "Stuff in France".  But you're right, it's not quite as
clean as I had expected.
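
A small sketch of the difference, under loud assumptions: the helper `articles_under`, the dict shapes, and the second article (a made-up bridge outside France that is merely tagged with the "France" theme) are all hypothetical, and "Stuff in France" is just the placeholder name from above.

```python
def articles_under(root, subcats, members):
    """Collect every article in `root` or in any category below it."""
    seen, stack, found = set(), [root], set()
    while stack:
        cat = stack.pop()
        if cat in seen:  # guard against cycles in the category graph
            continue
        seen.add(cat)
        found |= members.get(cat, set())
        stack.extend(subcats.get(cat, ()))
    return found


# Hypothetical toy data.
subcats = {"Stuff in France": ["Buildings and structures in France"]}
members = {
    "Bridges": {"Pont du Gard", "Bridge abroad by a French engineer"},
    "Buildings and structures in France": {"Pont du Gard"},
    "France": {"Pont du Gard", "Bridge abroad by a French engineer"},  # theme
}

# Intersecting with the theme returns anything related to France at all.
theme_hits = members["Bridges"] & articles_under("France", subcats, members)
# Intersecting with the "is in" attribute returns only bridges located there.
attribute_hits = members["Bridges"] & articles_under("Stuff in France", subcats, members)
```

Here `theme_hits` contains both bridges, while `attribute_hits` contains only "Pont du Gard".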

I wonder though if this is just another aspect of the
[[Category:Categories by topic]] issue.

> One solution would be to create "is part of" or "is in" categories in
> addition to the "is a" categories. Then you can have a hierarchy with "Pont
> du Gard" -> "France" -> "Europe", or "cylinder" -> "engine" -> "car".
>
> We'd have two independent types of categories, plus a third, the catch-all
> "theme".
>
> > > female human beings and your example [[Category:Feminine hygiene]]? How
> > > about [[Category:Women's rights]]? Add an umbrella cat "Somehow related to
> > > women" maybe?
> >
> > [[Category:Women]] could be a subcategory of [[Category:Woman]].
>
> Heh. Looking at [[Category:Women]], it's now a subcat of both
> [[Category:People]] and [[Category:Humans]].
>
The latter should probably be [[Category:Humanity]] :).

Here's another fun fact.  [[Category:Humans]] is a subcat of
[[Category:People]].  Ick.

And now that I look at it, [[Category:Humans]] is also a subcat of
[[Category:Apes]] (presumably a theme).  'Course that wouldn't have
caught my eye if I'd been acquainted with the terminology in which
"humans" are "apes" :).

OK, this plural/singular thing probably isn't going to work.

> > mistakes all over again.  The advantage of my proposal to not allow
> > themes as subcategories of attributes is that it can be implemented
> > today, without much disruption, and without modifying any code.
>
> What I missed in your proposal is how you retain existing theme
> information. Duplicate categories in a theme namespace?
>
> Roger

Yeah.  So if [[Category:Humans]] was a theme and an attribute, we'd
split it into [[Category:Humans]] (the attribute) and
[[Category:Humanity]] (the theme), or something.
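
Finding the places where the "no themes under attributes" rule is broken could be a simple scan once each category is labelled. A sketch only: the `kind` labels, the dict shapes, and the function name are assumptions of mine, and the labelling itself would have to be done by hand.

```python
def themes_under_attributes(kind, parents):
    """Return (theme, attribute) pairs where a theme is a subcategory of an attribute."""
    return sorted(
        (child, parent)
        for child, child_parents in parents.items()
        for parent in child_parents
        if kind.get(child) == "theme" and kind.get(parent) == "attribute"
    )


# Hypothetical labels for categories mentioned in this thread.
kind = {
    "Paris": "theme",
    "Capitals in Europe": "attribute",
    "Humans": "attribute",
    "Humanity": "theme",
    "People": "attribute",
}
parents = {
    "Paris": {"Capitals in Europe"},
    "Humans": {"People"},
}

violations = themes_under_attributes(kind, parents)
```

With these labels the only pair flagged is Paris under Capitals in Europe.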

I suppose the plural/singular thing is still useful as a rough
guideline.  If things are kept separated it doesn't have to be
followed strictly.

Anyway, I'm starting to think things are too far gone by now.  Might
as well just wait for the features first and then start reorganizing
things.  But I've got a little bit of hope, and my import script of
the downloaded data is still running.

Anthony


