How about this: [[Category:1929 births]] [[Category:German authors]] [[Category:Women]]
Hi all, this is my first mail to the list,
Firstly, a small gripe of mine: the current category system does not recursively list pages. Consider a Category:Cats with two subcategories, Category:WhiteCats and Category:BlackCats. If I go to the Category:Cats page, I am not shown any pages that are part of Category:WhiteCats or Category:BlackCats. Also, someone could make Category:Cats a subcategory of Category:BlackCats. This creates a loop in the category hierarchy: Cats -> BlackCats -> Cats.

In my opinion:
- When displaying Category:Cats, all members of this category and its subcategories should be displayed to the user.
- If a user adds a subcategory to a category, Wikipedia should check for loops as a post-condition of the edit, and reject the edit if it does create a loop.
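To make the two points above concrete, here is a minimal sketch in Python. It is not how MediaWiki stores categories; the dictionaries `subcats` and `pages`, the page titles, and both function names are made up for illustration. It checks for a loop before the edit is applied, which has the same effect as rejecting the edit afterwards.

```python
# Hypothetical in-memory model (not MediaWiki's real schema):
# category name -> direct subcategories, and category name -> direct member pages.
subcats = {"Cats": {"WhiteCats", "BlackCats"}, "WhiteCats": set(), "BlackCats": set()}
pages = {"Cats": {"Cat"}, "WhiteCats": {"Snowball"}, "BlackCats": {"Salem"}}


def all_pages(category, subcats, pages):
    """Collect the pages of a category and of all its subcategories."""
    seen, stack, result = set(), [category], set()
    while stack:
        current = stack.pop()
        if current in seen:  # guards against loops that already exist
            continue
        seen.add(current)
        result |= pages.get(current, set())
        stack.extend(subcats.get(current, ()))
    return result


def would_create_loop(parent, child, subcats):
    """True if making `child` a subcategory of `parent` would create a loop.

    A loop appears exactly when `parent` is already reachable from `child`.
    """
    seen, stack = set(), [child]
    while stack:
        current = stack.pop()
        if current == parent:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(subcats.get(current, ()))
    return False
```

With this model, `would_create_loop("BlackCats", "Cats", subcats)` reports the Cats -> BlackCats -> Cats loop, so the edit could be rejected.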
Now, this thread is about adding metadata (data about the data) to pages. Broadly speaking, two suggestions have been put forward here to accomplish the same goal.
1. To use better categories to store data about pages.
2. To invent a new, separate way of classifying pages.
There are benefits and drawbacks associated with each suggestion:
1. To use better categories to store data about pages:
This method does not store any information about the links between data. For example, there are subtle semantic differences between defining a person as Category:Irish and defining a ship as Category:Irish.
2. To invent a new, separate way of classifying pages:
This way of adding metadata to pages would be custom designed to achieve the goal, and would be better than the current category system. However, new code would have to be written to add this, which would take time, labour and money, and would introduce bugs that would need to be fixed, costing more time, labour and money. The current category system exists, and works. I therefore feel that it would be a better decision to augment the current category system than to create a second system that shares a lot of common features with it. Also, most of the pages in Wikipedia currently have categories, and if a new system were created, all the existing pages would have to be tagged (mostly by hand) for the new system.
If I were in charge of Wikipedia, I would overhaul the current categories, removing the very specific categories and adding the ability to perform basic set operations on categories.
For example: currently there exists a Category:Irish_Poets. I would delete this category and add all pages in it to Category:Irish and Category:Poets. If a user wants a list of Irish poets, he would ask Wikipedia for (Irish) Intersection (Poets).
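As a rough sketch of what such a query could look like, assuming each category is held as a plain Python set of page titles (the `members` dictionary, its sample data and the `intersection` function are invented for illustration):

```python
# Hypothetical sample data: category name -> set of page titles.
members = {
    "Irish": {"Seamus Heaney", "W. B. Yeats", "Liam Neeson"},
    "Poets": {"Seamus Heaney", "W. B. Yeats", "Emily Dickinson"},
}


def intersection(*category_names):
    """Return the pages that belong to every one of the named categories."""
    sets = [members.get(name, set()) for name in category_names]
    return set.intersection(*sets) if sets else set()


irish_poets = intersection("Irish", "Poets")
```

Here `irish_poets` holds only the pages that are in both Category:Irish and Category:Poets, without a Category:Irish_Poets ever being stored.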
Also I would create a category Person. All people in Wikipedia would be part of this category. Also I would create Category:Men and Category:Women. These would be subcategories of Category:Person. All people in Wikipedia would be part of either Category:Men or Category:Women. If a user is looking for articles about people, they can simply search within Category:Person to find the information that they are looking for.
Now we must think about data storage. Consider the article on Liam Neeson. Do we mark it as Category:Person and Category:Men? Or do we mark the page as just Category:Men? If we store it as both we have redundancy: it can be deduced from the fact that a page is part of Category:Men that it must be part of Category:Person too, since Person is a superset of Men. But for Wikipedia to reason that a Man is a Person takes computation. The fact that a Man is a Person should never change, so we could mark the page as Category:Person automatically when a page is marked Category:Men. We are storing more information than is necessary, but future reasoning becomes faster.
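The "mark it automatically" option could be sketched as follows. This assumes a hypothetical `parents` mapping from each category to its direct supercategories; the names are mine, not anything that exists in the software.

```python
# Hypothetical mapping: category -> its direct supercategories.
parents = {"Men": {"Person"}, "Women": {"Person"}, "Person": set()}


def with_ancestors(categories, parents):
    """Return the given categories plus every supercategory implied by them."""
    result, stack = set(), list(categories)
    while stack:
        current = stack.pop()
        if current in result:
            continue
        result.add(current)
        stack.extend(parents.get(current, ()))
    return result


# Done once at save time, so that later lookups need no reasoning.
stored = with_ancestors({"Men"}, parents)
```

The trade-off is the one described above: the stored set is redundant, but a later search within Category:Person becomes a simple membership test.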
Disjointness must also be considered: Category:Men is disjoint from Category:Women. How can we store this information in the database? We may have to add a new tag to category pages:
Category:Men [[DisjointFrom:Women]]
(A page cannot be a member of both Men and Women. For the sake of argument, please ignore transgender and transsexual people etc.; this is just a blackboard example.)
Also, when someone adds the above tag, the page Category:Women must automatically be marked as [[DisjointFrom:Men]].
What happens when a Wikipedia user edits the Liam Neeson page and adds Category:Women? Should Wikipedia search which categories have been marked as disjoint (this is a computationally expensive operation), and not allow the edit?
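One possible shape for this check, as a sketch only: the `disjoint` table and both function names are hypothetical, and `mark_disjoint` records the relation in both directions, as the DisjointFrom tag above would require.

```python
from collections import defaultdict

# Hypothetical table: category -> categories it has been declared disjoint from.
disjoint = defaultdict(set)


def mark_disjoint(a, b):
    """Record that a and b are disjoint, in both directions."""
    disjoint[a].add(b)
    disjoint[b].add(a)


def conflicts(page_categories, new_category):
    """Return the page's existing categories that are disjoint from new_category."""
    return set(page_categories) & disjoint.get(new_category, set())


mark_disjoint("Men", "Women")
liam_neeson = {"Person", "Men"}
clash = conflicts(liam_neeson, "Women")
```

If `clash` is non-empty the edit could be refused. In this simplified model the check only compares the page's own categories against one stored set, so it may be cheaper than feared, though a real category hierarchy would also have to take subcategories of Men and Women into account.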
Also, with the category system we could save Categories as virtual Categories. For example, consider that a VCategory is a virtual category:
VCategory:Irish_Poets would simply be a re-direct to: Category:Irish (intersection) Category:Poets
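A virtual category could then be little more than a stored query. In the sketch below the `virtual` table, the `resolve` function and the sample data are assumptions for illustration, and only intersection is supported.

```python
# Hypothetical sample data: real categories hold pages directly.
members = {
    "Irish": {"Seamus Heaney", "W. B. Yeats", "Liam Neeson"},
    "Poets": {"Seamus Heaney", "W. B. Yeats", "Emily Dickinson"},
}

# A virtual category stores no pages, only the categories to intersect.
virtual = {"Irish_Poets": ("Irish", "Poets")}


def resolve(name):
    """Return the pages in a category, expanding virtual categories on demand."""
    if name in virtual:
        # Each operand may itself be real or virtual.
        return set.intersection(*(resolve(operand) for operand in virtual[name]))
    return set(members.get(name, set()))


irish_poets = resolve("Irish_Poets")
```

Nothing is stored for VCategory:Irish_Poets beyond its definition, so its contents follow automatically whenever a page is added to Category:Irish or Category:Poets.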
What do other people think?
Best Regards,
Marc O'Morain