No subject


Sun Jan 7 16:43:09 UTC 2007


Over-categorization

Over-categorization is what happens when an image is placed in several
categories within the same tree. The general rule is to always place an
image in the most specific categories, and not in the levels above
those. An example:

We'll assume that yellow spheres are spheres with a yellow color. We
can think about Category:Yellow spheres and Category:Spheres. The
picture to be categorized shows yellow marbles. We categorize the file
in Category:Yellow spheres. Now, if we also categorize the image file
in Category:Spheres, this is over-categorization, because we already
know that the yellow marbles are spheres. This applies to most images:
as mentioned above, files in Category:Paris should not also be in
Category:France, files in Category:Albert Einstein should not also be
in Category:Physicists from Germany, and so on.
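
The rule can be checked mechanically. What follows is only a rough
sketch in Python, not how any wiki software actually does it: the
PARENTS table is a hypothetical stand-in for the real category tree and
holds just the three parent links named above, and the function names
ancestors and redundant_categories are made up for illustration. A
category on a file is flagged as redundant when it sits above another
category that the same file is already in.

```python
# Hypothetical parent links (child category -> parent categories).
# Only the three relations named in the text are included; a real
# category tree would be far larger.
PARENTS = {
    "Yellow spheres": ["Spheres"],
    "Paris": ["France"],
    "Albert Einstein": ["Physicists from Germany"],
}


def ancestors(category, parents=PARENTS):
    """Return every category above `category` in the tree."""
    seen = set()
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def redundant_categories(file_categories, parents=PARENTS):
    """Return the categories that lie above another category on the same file."""
    assigned = set(file_categories)
    redundant = set()
    for category in assigned:
        redundant |= ancestors(category, parents) & assigned
    return redundant


# The yellow marbles picture, filed in both categories:
print(redundant_categories(["Yellow spheres", "Spheres"]))  # {'Spheres'}
# Two unrelated specific categories are fine:
print(redundant_categories(["Paris", "Albert Einstein"]))   # set()
```

Anything the function returns is a category the file could be taken
out of without losing information, since the more specific category
already implies it.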

Why is over-categorization a problem

It's often assumed that the more categories an image is in, the easier
it will be to find. Another example: by that logic, every image
showing a man should be in Category:Men, because even if you know
nothing more about the person you're looking for than that he is a
man, you'll be able to find him there. The result is that the top
category fills up, making it necessary to go through hundreds, or in
this case more likely thousands, of images to find the one you want.
You probably won't find what you're looking for, and what's more, those
who are looking for a generic picture of a man to illustrate an article
like en:Man will find that such pictures are drowned out among the
movie stars, scientists and politicians.

On lower levels, the problem becomes less acute, since the number of
images will be smaller, although it can still easily reach into the
hundreds. But there is still a problem. Let's go back to Einstein. I
know that he's a physicist, so I'll look among the physicists. I find
an image among the hundreds in the category; I'm not too happy with
it, but it's the only one there. Since there was an image there, I
assume that there are no more hidden elsewhere, rather than looking
further in Category:Physicists from Germany and thus finding
Category:Albert Einstein, where there might be a better one.

*** So over-categorization has led to two problems: The top category
is cluttered, and users will stop looking for the most relevant
category since they've reached one that has a relevant image. ***



More information about the WikiEN-l mailing list