Jakob wrote:
Categories considered harmful
I generally appreciate your comments, but I would never go so far as to consider this initiative harmful.
Since version 1.3 of MediaWiki we have the nice category function. In the German Wikipedia there is a lot of confusion and struggle over how to use categories in the right way. As a student of library science I could name several methods for classifying, indexing and sorting things, but none of them seems easily applicable with the current implementation of categories.
The confusion and struggle that you are experiencing on the German Wikipedia is being faced by all the projects. Each is likely to find its own way of dealing with the issue, and the solutions are likely to show considerable variation. That's fine and very wiki. Some will undoubtedly be discarded at a later stage, as a part of a normal evolutionary process. We need to avoid being overly critical of those who experiment with other possibilities.
I also believe that effective categorization depends on having a properly functioning internal search function. Hopefully, the day will come when our developers will be able to get past the constant stresses on the system, and find something that does not depend on Google. :-)
As far as I can tell there are three main reasons for Wikipedia's success:
- It's very easy to contribute (Wikitax, everybody can edit)
- Every edit is monitored in watchlists and the list of last edits, so we can control each other
- There is a clear common mission - to create an encyclopedia (+NPOV)
Agreed.
As far as I can see, the category function contradicts all of them:
To a significant extent, yes.
- It's not easy.
It's not easy to know how to do it in the right way because subject indexing is a complex issue, and it's not easy because of gaps in the implementation (no rename, no redirects, no assignment of articles to categories without editing every single article page). When editing an article I have to guess which categories exist, how they are spelled, and what the rules are for what to classify into them and what not.
Perhaps it's too easy. Anybody can propose any new category, including misspelled ones. Doing it without creating chaos is a different thing. It's especially difficult for people who specialize in a particular area of knowledge. To categorize effectively at its top level requires an ability to grasp the "big picture" of the Wikipedia.
- It's not controllable.
You cannot watch a category to be notified of new articles or of when somebody removes an article from the category.
Mostly yes. Building categories is a top-down activity. Categorizing articles is a bottom-up activity. The challenge lies in establishing an interface between these two activities.
- There is no common mission
Can anybody tell the purpose of categories? Finding articles (without a coordinated search function?!) Browsing in topics (without a clear overview of all categories?!) Are we trying to index articles with subject headings, using a thesaurus, a classification or even a structured ontology? Library science has invented several kinds of schemes like that, but at the moment everybody is muddling this and that, trying to reinvent the already invented wheels of documentation (by the way, there are also methods of automatic indexing, clustering and classification).
Whatever the mission of categories, it is a subordinate mission motivated by a desire to make the information in the projects more accessible. Categories have no meaning in an information vacuum. I would answer your questions by saying, "All of the above, and more."
Library science has indeed invented numerous schemes. Any such scheme designed for general application is as good as its competitors. Each developed independently to address the priorities of the originating library. Any of them may thus be validly criticized for its nationalist tendencies. Nevertheless, choosing one of them to serve as a starting point need not be a nationalist act. That choice is more likely to be driven by the availability of detailed data, and the willingness of some individual(s) to do the work of adapting that system to serve wiki purposes.
The muddling and the re-invention of the wheel implicit in most people's approach to categorization was completely foreseeable. I say this without finding fault. It was just one of those miseries that had to be gone through; system convergence comes later. Let's just keep away from automated systems until we know what we want. A premature application of automation will only support the muddling.
And: In classification there is no NPOV, because there is no "right" way to classify the world; it depends on the particular needs and questions I want to answer with a particular system of subject indexing.
I agree with what you seem to be saying but I would not put it in terms of NPOV. "Wiki is not paper," is a far more useful principle. A traditional librarian may want to classify a single copy of a book on German libraries, and must decide whether that book should be shelved with books about Germany or books about libraries. We do not have that restriction.
Given these reasons, I strongly recommend that we stop using the categories and focus on writing and improving good articles. Many categories can easily be replaced with normal links between articles. Adding and removing categories does not change an article's content a bit. If you want to keep track of all articles in some area, use (Wiki)Projects, article series, and portals, and learn how to use the "what links here" function! A good article is an article that can be found easily without categories.
I don't arrive at the same conclusion about stopping the use of categories. The techniques that you mention are all good and effective, and they should obviously continue to be used. Categories are a way of providing a comprehensive overview. It is easy to see at an appropriate place on an article just how that article has been categorized, or indeed '''if''' it has been categorized. A list system has its uses, but needs to be manually maintained. It is not evident on the face of the article that it has been properly listed. A non-contributing reader may not be aware of the list's existence, or of the purpose of "What links here." Even a contributor is not going to be inclined to check every article to see if it has been properly listed, but without checking he has no way of knowing. This brings me back to my earlier point about categorization of articles being a bottom-up procedure.
Indeed classifying wikipedia articles is very interesting and will become more important, but this should be an independent project - maybe in a "Classifipedia" or "Categorypedia" that links to wikipedia articles.
No. The categories are meaningless in isolation.
You know - librarians normally do not write the books they organize, and search engine experts do not write the websites they crawl, so let's focus on what we can do best: creating the most detailed, most understandable and freest encyclopedia in the history of mankind!
Your premise here is the most important thing that you say. No professional librarian would tolerate an author who goes around insisting that his books be classified in a particular way. The authors, the editors and the classifiers all have their own roles on the wiki. All are working toward the goal that you specify, but not in complete isolation.
Ec