On 4/12/07, Marc Riddell michaeldavid86@comcast.net wrote:
on 4/12/07 9:58 AM, Peter Jacobi at peter_jacobi@gmx.net wrote:
Can't resist the urge...
If John Doe died from Lung Cancer, I would like to have his name appear in the Subcategory List of "Lung cancer deaths". I would also like to see his name appear in the Main Category List of "Cancer deaths".
And I would prefer to not have both categories. People (with few exceptions) are not known for dying from cancer and in 99% of all cases this fact is totally unrelated to the reasons they are known for.
This is not a categorization in the sense of a useful navigation aid, or for forming a systematic hierarchy of articles. It is an exercise in knowledge representation and I fear it will only stop when every sentence in every article is also represented by a category membership.
Regards, Peter
As long as we are being driven by urges :-):
If I am a researcher, and would like to know which persons in the encyclopedia died from cancer, how would I proceed to learn this?
Marc
From the commons page, change "image" to "article" in each instance.
Over-categorization
Over-categorization is what happens when an image is placed in several categories within the same tree. The general rule is to always place an image in the most specific categories, and not in the levels above those. An example:
We'll assume that yellow spheres are spheres with a yellow color. We can think about Category:Yellow spheres and Category:Spheres. The picture to be categorized shows yellow marbles. We categorize the file in Category:Yellow spheres. Now, if we also categorize the image file in Category:Spheres, this is over-categorization, because we already know that the yellow marbles are spheres. This applies to most images: as mentioned above, files in Category:Paris should not also be in Category:France, files in Category:Albert Einstein should not be in Category:Physicists from Germany, and so on.
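The rule above can be checked mechanically. What follows is only a minimal sketch, assuming the category tree is available as a plain dictionary mapping each category to its parent categories; the names PARENTS, ancestors and redundant_categories are made up for illustration and are not part of MediaWiki or any Commons tool.

```python
# Hypothetical category tree (child -> parent categories), taken from the
# examples in the guideline. This is an assumption for illustration only.
PARENTS = {
    "Yellow spheres": ["Spheres"],
    "Spheres": [],
    "Paris": ["France"],
    "France": [],
}


def ancestors(category, parents):
    """Return every category above `category` in the tree."""
    seen = set()
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def redundant_categories(assigned, parents):
    """Return the assigned categories that are ancestors of another assigned one."""
    assigned_set = set(assigned)
    redundant = set()
    for category in assigned:
        redundant |= ancestors(category, parents) & assigned_set
    return redundant


# The yellow marbles picture filed under both categories:
print(redundant_categories(["Yellow spheres", "Spheres"], PARENTS))
```

For the yellow marbles example, the sketch flags Category:Spheres as the redundant one, which is the category the guideline says to leave out.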
Why is over-categorization a problem
It's often assumed that the more categories an image is in, the easier it will be to find. Another example: by that logic, every image showing a man should be in Category:Men, because even if you know nothing more about the person you're looking for than that he is a man, you'll be able to find him there. The result is that the top category fills up, making it necessary to go through hundreds, or in this case more likely thousands, of images to find the one you want. You probably won't find what you're looking for, and what's more, those who are looking for a generic picture of a man to illustrate an article like en:Man will find that such pictures have been drowned out among the movie stars, scientists and politicians.
On lower levels, the problem becomes less acute, since the number of images will be smaller — they can still easily reach into the hundreds, though. But there is still a problem: Let's go back to Einstein. I know that he's a physicist, so I'll look there. I find an image among the hundreds in the category, which I'm not too happy with, but it's the only one there. Since there was an image there, I assume that there are no more hidden elsewhere, rather than look further in Category:Physicists from Germany and thus find Category:Albert Einstein where there might be a better one.
*** So over-categorization has led to two problems: The top category is cluttered, and users will stop looking for the most relevant category since they've reached one that has a relevant image. ***
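On the researcher's question earlier in the thread (which persons died from cancer): if articles sit only in the most specific category, the full list can still be gathered by walking the subcategories. The following is a rough sketch under stated assumptions, not a description of how MediaWiki itself answers such queries. The dictionaries SUBCATEGORIES and MEMBERS and the function members_in_tree are hypothetical stand-ins for real category data.

```python
# Hypothetical data, using the John Doe example from the thread.
SUBCATEGORIES = {
    "Cancer deaths": ["Lung cancer deaths"],
    "Lung cancer deaths": [],
}
MEMBERS = {
    "Cancer deaths": [],
    "Lung cancer deaths": ["John Doe"],
}


def members_in_tree(root, subcategories, members):
    """Collect the members of `root` and of every category below it."""
    found = set()
    visited = set()
    stack = [root]
    while stack:
        category = stack.pop()
        if category in visited:
            continue  # guards against loops in the category graph
        visited.add(category)
        found.update(members.get(category, []))
        stack.extend(subcategories.get(category, []))
    return found


print(members_in_tree("Cancer deaths", SUBCATEGORIES, MEMBERS))
```

Here John Doe is listed only under "Lung cancer deaths", yet a walk starting at "Cancer deaths" still finds him.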