[WikiEN-l] Types of categories

Steve Bennett stevagewp at gmail.com
Sat Jun 3 17:54:27 UTC 2006


I'm probably not the only one who envisages all the wonderful things
that could be done with this massive collection of information that is
Wikipedia, *if only* we could do something clever with the categories.
And then you realise that you can't really do anything clever because
"category" has all sorts of different meanings to different people.

So far I have identified four rough types of categories. I'll invent
the notation a(X) to mean that article X is in category a. a(b(X)) means
that b is a subcategory of a, and X is in b.

Taxonomies: Tend to end in "s" and satisfy the rule that "If a(X) then
X is an a" is a logical sentence. Tend to form strict hierarchies,
where if a(X) and b(a), then it's perfectly natural and normal that
b(a(X)). Eg, Bridges in France is a subcat of Bridges, and every entry
in "bridges in France" is definitely a Bridge.  It's rare for an
article to be in more than two taxonomic categories at once.
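
As a rough sketch of the taxonomy rule above (if a(X) and b(a), then
b(a(X)) holds too), here is a small Python example. The dictionaries,
the function names, the article "Pont Neuf" and the "Structures"
category are all made up for illustration. None of this is how
MediaWiki actually stores categories.

```python
# Hypothetical data: category -> its parent categories, i.e. b(a).
PARENTS = {
    "Bridges in France": {"Bridges"},
    "Bridges": {"Structures"},  # assumed parent, for illustration only
}

# Hypothetical data: article -> categories it is directly in, i.e. a(X).
ARTICLE_CATEGORIES = {
    "Pont Neuf": {"Bridges in France"},
}


def ancestors(category, parents=PARENTS):
    """Return every category reachable by following parent links upward."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_categories(article, memberships=ARTICLE_CATEGORIES):
    """Direct categories plus everything inherited through the hierarchy."""
    result = set(memberships.get(article, ()))
    for category in list(result):
        result |= ancestors(category)
    return result
```

In a strict taxonomy every inferred category still reads sensibly:
an article in "Bridges in France" is also a bridge.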

Themes: Tend not to be plurals, and tend not to form strict
hierarchies. Often it is the case that b looks like it belongs in a,
but then a(b(X)) is nonsense for certain X. Eg, Paris might be in
European cities, and the film Amelie might be in Paris, but it's silly
to say that Amelie is in European cities. (or many worse examples)

Attributes: The category exists to denote some very specific small
detail of a subject, such that it would be conceivable to have dozens
or more such categories on an article. Examples: 1943 deaths, Living
persons, Winners of Nobel Peace Prize, etc. These tend to form
hierarchies that start strict then end up fuzzy. Eg, 1943 deaths is
only in 1943 and "1940s deaths", and these have parent categories of
"1940s", "Years" and so forth, eventually ending up in "History",
whereupon things become chaos.

Meta-attributes: These are categories about *articles* rather than
article subjects. The most common examples are stubs ("France
geography stubs"), sources ("1911 Encyclopaedia Britannica") and
disputes of various kinds ("Articles lacking sources").

To me, these types of categories are all fairly incompatible, and
really get in the way of using categories to do anything useful. It's
pointless trying to draw tree structures when you have attributes and
meta-attributes involved, for example.
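
One possible way to picture this, purely as a sketch: label each
category with one of the four types above, and only follow taxonomy
links when drawing a tree. The Kind enum, the hand-written labels and
the function name below are assumptions for illustration (so is
putting "Pont Neuf" in a stub category). Nothing like this labelling
exists in Wikipedia itself.

```python
from enum import Enum


class Kind(Enum):
    TAXONOMY = "taxonomy"
    THEME = "theme"
    ATTRIBUTE = "attribute"
    META = "meta-attribute"


# Hypothetical hand-assigned labels; real categories carry no such tag.
KIND = {
    "Bridges in France": Kind.TAXONOMY,
    "Bridges": Kind.TAXONOMY,
    "European cities": Kind.TAXONOMY,
    "Paris": Kind.THEME,
    "1943 deaths": Kind.ATTRIBUTE,
    "France geography stubs": Kind.META,
}

# Hypothetical data: category -> parent categories.
PARENTS = {
    "Bridges in France": ["Bridges"],
    "Paris": ["European cities"],
}

# Hypothetical data: article -> categories it is directly in.
MEMBERSHIPS = {
    "Pont Neuf": ["Bridges in France", "France geography stubs"],
    "Amelie": ["Paris"],
}


def taxonomic_categories(article):
    """Collect only the categories reachable through taxonomy links."""
    result = set()
    stack = [c for c in MEMBERSHIPS.get(article, ())
             if KIND.get(c) is Kind.TAXONOMY]
    while stack:
        current = stack.pop()
        if current in result:
            continue
        result.add(current)
        # Stop climbing as soon as a parent is not a taxonomy.
        stack.extend(p for p in PARENTS.get(current, ())
                     if KIND.get(p) is Kind.TAXONOMY)
    return result
```

With these labels, the bridge article ends up under "Bridges", the
stub category is ignored, and Amelie never ends up in "European
cities" because "Paris" is marked as a theme.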

So my questions are these:
*Can anyone think of other types of categories I might have missed?
*How could Wikipedia be better if this general problem was addressed?
*How could this problem be addressed?

Steve


