Magnus Manske wrote:
Filters only work if articles are assigned to categories. Setting aside whether we should use categories in wikipedia itself or only in some sifter project, categories have to be implemented in the software either way. So I hacked a bare-bones implementation at the test site. A list of current categories can be found at
For a variety of reasons, I've been waiting a few days to jump in on this one. Like Erik, I have been arguing for some kind of category system, and have been dismayed when this important initiative gets lost in a straw man argument over censorship or copyrights. Yes, a category system can be used as a tool for such purposes, but its value for Wikipedia goes well beyond such narrow confines.
Currently, anyone can add and delete categories. I suggest that this be restricted to sysops later, as that will prevent "category inflation" as well as malicious deletion of a whole category.
This aspect has already had a lot of response, particularly from those concerned about the openness of Wikipedia. I don't see "category inflation" as a serious problem. Of course, if the category list were to remain in its current unordered draft state, its uselessness would increase with its size. In applying restricted access of any kind, system vulnerability, either to malice or to accident, becomes the primary criterion. Even the most trustworthy among us can sometimes fall prey to an "oops" moment. I can see a distinction between integrated and non-integrated categories, with only the former being subject to restricted editing, and even integrated categories could have their descriptions fully editable. Altering integrated categories could have unforeseen and often far-reaching consequences that may not be readily apparent.
Anyone can assign any article to any category, and remove that assignment as well. Wiki, right? :-)
Of course!
Currently, categories are *not* shown on the article page, although I have written that code (keep getting some weird effect, though, so I turned it off).
To be done:
- Personal category filters
- Search for a category combination (in the example online, "Biology" and "Germany" should list "Anton de Bary")
All in good time.
Thoughts? Comments? Bullets? ;-)
Seeing the jumbled nature of a category list that is only 19 items long has only added to my conviction that we should have a codified hierarchical system.
For a Wikipedia (I have very different approaches for a Wiktionary) I would argue for a system that '''starts''' with the Library of Congress Classification System. Before I get back a lot of comments about why it's not a good system, let it be known that I am perfectly aware of its shortcomings, notably its American slant on subject classification, and the fact that its century-old structure may not be appropriate to many modern subjects. In its favour are some simple facts: it is there and in the public domain; it is hierarchical and easily subdivided when we require it; and it is well known and accessible at its most fundamental structure, which gives us a coherent starting point. An alternative system would be just as good, but it should have these characteristics if we want to avoid a reinventing-the-wheel kind of situation. We need to remember too that whatever system we choose will be modified as we progress.
Some people complained before that they don't like the idea of having to learn a long list of different codes. That's fair enough, but it's important to remember that most contributors will work within a limited range of subject areas, and that in itself will limit the codes that they need to remember. Of course coding and classification is an optional task. If a person feels comfortable writing text, but feels lost with codification, he can leave that to somebody else even as we aim to make codification as easy as possible.
Any article should be classifiable in several categories. Thus the Anton de Bary article can appear in CT for biography, DD for Germany and QH for biology, plus whatever else Wikipedians consider appropriate. Unlike with printed books, there is no need to limit ourselves to a single classification to enable us to find the book on a shelf somewhere in a library.
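To make that concrete, here is a rough sketch in Python of multiple classification, assuming nothing more than an in-memory mapping from article titles to sets of codes. The names `assignments` and `articles_in_all` and the two placeholder articles are invented for illustration; none of this is taken from the code on the test site. Searching for a combination of categories then reduces to a subset test:

```python
# Hypothetical store: article title -> set of category codes.
# Only the Anton de Bary entry comes from the discussion above;
# the other two titles are placeholders for illustration.
assignments = {
    "Anton de Bary": {"CT", "DD", "QH"},
    "Placeholder article about a German town": {"DD"},
    "Placeholder article about a biology topic": {"QH"},
}


def articles_in_all(assignments, *codes):
    """Return the titles assigned to every one of the given codes."""
    wanted = set(codes)
    # An article matches when the wanted codes are a subset of its codes.
    return sorted(
        title for title, cats in assignments.items() if wanted <= cats
    )


# Biology (QH) combined with Germany (DD):
matches = articles_in_all(assignments, "QH", "DD")
print(matches)
```

With the sample data above, only the Anton de Bary article carries both codes, so it is the only title returned.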
A category would have three elements: a code, a title, and a description. The codes would be brief and hierarchical; they would also be sufficient as broad search elements. The titles would function in a manner similar to the present article titles. They could appear after a code as a dumb descriptor for that code and be linked directly only to the third element. Like most articles, these descriptions would be fully editable, and if any edit wars were to arise out of the classification system, this is where they would happen.
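As a minimal sketch of that three-part record, assuming a plain Python dataclass (the class name `Category`, the helper `subcategories`, and the description texts are all hypothetical), the hierarchy falls out of the codes themselves: a broad search is just a prefix match on the code.

```python
from dataclasses import dataclass


@dataclass
class Category:
    code: str         # brief hierarchical code, e.g. "QH"
    title: str        # dumb descriptor shown after the code
    description: str  # freely editable text, like an article


def subcategories(categories, prefix):
    """Return every category whose code begins with the given prefix."""
    return [c for c in categories if c.code.startswith(prefix)]


# Sample entries; the descriptions are placeholder text.
categories = [
    Category("CT", "Biography", "Placeholder description."),
    Category("DD", "Germany", "Placeholder description."),
    Category("Q", "Science", "Placeholder description."),
    Category("QH", "Biology", "Placeholder description."),
]

# A broad search on "Q" picks up Q itself and everything beneath it.
for cat in subcategories(categories, "Q"):
    print(cat.code, cat.title)
```

Keeping the title and description separate from the code means the description can be rewritten freely without disturbing anything that depends on the code.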
Magnus's idea of using drop-down boxes for putting things into categories should work well with a hierarchical scheme. This could be expanded into a series of nested drop-down boxes as required. For the most part LC uses only 2 letters in its classifications, and even then there are many unused classes. Only 3 letters would give us the capacity for 17,576 codes. In the LC system the "E" category is about the United States, and is not normally further developed into lettered sub-classes (though it does have numerical subdivision). We could choose to use "EL" for United States localities, and that would be one of our second-level drop-down choices. A third-level choice might divide these localities by state, but since the drop-down list of all states is taller than most people's screens, it might be limited to the initial letters of the states, and we could wait until the fourth level to sort out California, Colorado and Connecticut. Georgia, as the only "G" state, would have been sufficiently identified at the third level. There you have it -- all the RamBot articles have been classified. :-) There's a great deal of flexibility there.
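The arithmetic and the state example can be checked with a short Python sketch. It assumes a 26-letter alphabet for the codes and uses only a four-state sample rather than the full list; the helper name `group_by_initial` is made up for illustration.

```python
import string
from collections import defaultdict

# Three letters drawn from a 26-letter alphabet: 26 * 26 * 26 codes.
capacity = len(string.ascii_uppercase) ** 3
print(capacity)


def group_by_initial(names):
    """Group names under their first letter (the third-level choice)."""
    groups = defaultdict(list)
    for name in sorted(names):
        groups[name[0]].append(name)
    return dict(groups)


# Sample only, not the full list of states.
states = ["California", "Colorado", "Connecticut", "Georgia"]
third_level = group_by_initial(states)

for letter, names in third_level.items():
    if len(names) == 1:
        # A single state under this letter is already identified.
        print(letter, "->", names[0])
    else:
        # Several states share the letter, so a fourth level is needed.
        print(letter, "-> fourth-level choice among", names)
```

Each drop-down stays short because it only ever lists the choices beneath the selection made one level up.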
Eclecticology