On 4/12/07, Marc Riddell <michaeldavid86(a)comcast.net> wrote:
on 4/12/07 9:58 AM, Peter Jacobi at peter_jacobi(a)gmx.net wrote:
Can't resist the urge...
If John Doe died from Lung Cancer, I would like to have
his name appear in the Subcategory List of "Lung cancer deaths".
I would also like to see his name appear in the Main Category List
of "Cancer deaths".
And I would prefer to not have both categories. People (with
few exceptions) are not known for dying from cancer and in
99% of all cases this fact is totally unrelated to the
reasons they are known for.
This is not a categorization in the sense of a useful
navigation aid, or for forming a systematic hierarchy
of articles. It is an exercise in knowledge representation
and I fear it will only stop when every sentence in
every article is also represented by a category membership.
Regards,
Peter
As long as we are being driven by urges :-):
If I am a researcher, and would like to know which persons in the
encyclopedia died from cancer, how would I proceed to learn this?
Marc
_______________________________________________
WikiEN-l mailing list
WikiEN-l(a)lists.wikimedia.org
To unsubscribe from this mailing list, visit:
http://lists.wikimedia.org/mailman/listinfo/wikien-l
From the Commons page, change "image" to "article" in each instance.
Over-categorization
Over-categorization is what happens when an image is placed in several
categories within the same tree. The general rule is to always place an
image in the most specific categories, and not in the levels above
those. An example:
We'll assume that yellow spheres are spheres with a yellow color. We
can think about Category:Yellow spheres and Category:Spheres. The
picture to be categorized shows yellow marbles. We categorize the file
in Category:Yellow spheres. Now, if we also categorize the image file
in Category:Spheres, this is over-categorization, because we already
know that the yellow marbles are spheres. This applies to most images:
As mentioned above, files in Category:Paris should not also be in
Category:France, files in Category:Albert Einstein should not be in
Category:Physicists from Germany, and so on.
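
The rule quoted above can be checked mechanically. Here is a rough sketch,
not anything MediaWiki actually runs: it assumes the category tree is
available as a plain dictionary mapping each category to its direct
parents. The PARENTS table and the function names are made up for
illustration, and Category:Physicists as the parent of
Category:Physicists from Germany is my assumption.

```python
# Hypothetical parent map (illustration only): category -> direct parents.
PARENTS = {
    "Yellow spheres": {"Spheres"},
    "Albert Einstein": {"Physicists from Germany"},
    "Physicists from Germany": {"Physicists"},  # assumed parent name
    "Paris": {"France"},
}


def ancestors(category, parents=PARENTS):
    """Return every category above `category` in the tree."""
    seen = set()
    stack = list(parents.get(category, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def redundant_categories(assigned, parents=PARENTS):
    """Return assigned categories that are ancestors of another assigned one."""
    assigned = set(assigned)
    redundant = set()
    for category in assigned:
        redundant |= ancestors(category, parents) & assigned
    return redundant


def most_specific(assigned, parents=PARENTS):
    """Drop the redundant ancestors, keeping only the most specific categories."""
    return set(assigned) - redundant_categories(assigned, parents)
```

With this table, a page tagged with both "Yellow spheres" and "Spheres"
would have "Spheres" reported as redundant, while a page tagged only with
"Paris" would pass unchanged.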
Why is over-categorization a problem
It's often assumed that the more categories an image is in, the easier
it will be to find it. To take another example: by that logic, every
image showing a man should be in Category:Men, because even if you know
nothing more about the person you're looking for than that he is a
man, you'll be able to find it. The result is that the top category
fills up, making it necessary to go through hundreds, or in this case
more likely thousands, of images to find the one you want. You probably
won't find what you're looking for, and what's more, those who are
looking for a generic picture of a man to illustrate an article like
en:Man will find that such pictures are drowned out among the movie
stars, scientists and politicians.
On lower levels, the problem becomes less acute, since the number of
images will be smaller — they can still easily reach into the
hundreds, though. But there is still a problem: Let's go back to
Einstein. I know that he's a physicist, so I'll look there. I find an
image among the hundreds in the category, which I'm not too happy
with, but it's the only one there. Since there was an image there, I
assume that there are no more hidden elsewhere, rather than look
further in Category:Physicists from Germany and thus find
Category:Albert Einstein where there might be a better one.
*** So over-categorization has led to two problems: The top category
is cluttered, and users will stop looking for the most relevant
category since they've reached one that has a relevant image. ***
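
On the researcher question raised earlier in the thread: if articles sit
only in the most specific category, the full list for a parent category
can still be gathered by walking its subcategories. This is only a sketch
under assumptions. The SUBCATEGORIES and MEMBERS tables are hypothetical
stand-ins for whatever the wiki actually stores, and all_members is a
made-up helper, not an existing tool.

```python
# Hypothetical data (illustration only).
SUBCATEGORIES = {
    "Cancer deaths": ["Lung cancer deaths"],
    "Lung cancer deaths": [],
}
MEMBERS = {
    "Cancer deaths": [],
    "Lung cancer deaths": ["John Doe"],
}


def all_members(category, subcategories=SUBCATEGORIES, members=MEMBERS):
    """Collect the pages in `category` and in every category below it."""
    found = set()
    visited = set()  # guards against cycles in the category graph
    stack = [category]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        found.update(members.get(current, ()))
        stack.extend(subcategories.get(current, ()))
    return found
```

Here all_members("Cancer deaths") picks up John Doe even though he is
listed only under "Lung cancer deaths". Whether the wiki software offers
such a walk to ordinary readers is a separate question that this sketch
does not settle.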