[Wikipedia-l] Categories considered harmful

Ray Saintonge saintonge at telus.net
Sat Jun 19 21:45:18 UTC 2004


Jakob wrote:

>Categories considered harmful
>
I generally appreciate your comments, but I would never go so far as to 
consider this initiative harmful.

>Since Version 1.3 of MediaWiki we have the nice category function. In the 
>german wikipedia there is a lot of confusion and struggle on how to use 
>categories in the right way. As a student of library science I could tell 
>several methods how to classify, index and sort things but none of them 
>seems to be applicable easily with the current implementation of categories.
>
The confusion and struggle that you are experiencing on the German 
Wikipedia is being faced by all the projects.  Each is likely to find 
its own way of dealing with the issue, and the solutions are likely to 
show considerable variation.  That's fine and very wiki.  Some will 
undoubtedly be discarded at a later stage, as a part of a normal 
evolutionary process.  We need to avoid being overly critical of those 
who experiment with other possibilities.

I also believe that effective categorization depends on having a 
properly functioning internal search function.  Hopefully, the day will 
come when our developers will be able to get past the constant stresses 
on the system, and find something that does not depend on Google. :-)

>As far as I can tell there are three main reasons for Wikipedia's success:
>
>1. It's very easy to contribute (Wikitax, everybody can edit)
>2. Every edit is monitored in watchlists and list of last edits
>   so we can control each other
>3. There is a clear common mission - to create an encyclopedia (+NPOV)
>
Agreed.

>As far as I also can see the category-function contradicts all of them:
>
To a significant extent, yes.

>1. It's not easy. 
>
>It's not easy to know how to do it in the right way because subject 
>indexing is a complex issue and it's not easy because of lacks in the 
>implementation (no rename, no redirects, no assignment of articles to 
>categories without editing every single article page). Editing an 
>article I have to guess which categories are existing, how they are 
>spelled and the rules what to classify into them and what not.
>
Perhaps it's too easy.  Anybody can propose any new category, including 
misspelled ones.  Doing it without creating chaos is a different thing.  
It's especially difficult for people who specialize in a particular area 
of knowledge.  Categorizing effectively at the top level requires an 
ability to grasp the "big picture" of the Wikipedia.

>2. It's not controllable.
>
>You cannot watch a category to get noticed on new articles or when 
>somebody removes an article from the category. 
>
Mostly yes.  Building categories is a top-down activity.  Categorizing 
articles is a bottom-up activity.  The challenge lies in establishing an 
interface between these two activities.

>3. There is no common mission
>
>Can anybody tell the purpose of categories? Finding articles (without 
>a coordinated search function?!) Browsing in topics (without a clear
>overview of all categories?!) Are we trying to index articles with 
>subject heading, using a thesaurus, a classification or even a structure 
>ontology? Library science has invented several kind of schemes like that 
>but at the moment everybody is muddling this and that trying to invent 
>the already invented wheels of documentation (by the way there are also
>methods of automatic indexing, clustering and classification).
>
Whatever the mission of categories, it is a subordinate mission motivated 
by a desire to make the information in the projects more accessible.  
Categories have no meaning in an information vacuum.  I would answer 
your questions by saying, "All of the above, and more."

Library science has indeed invented numerous schemes.  Any such scheme 
designed for general application is as good as its competitors.  Each was 
developed independently to address the priorities of the originating 
library.  Any of them may thus be validly criticized for its nationalist 
tendencies.  Nevertheless, choosing one of them to serve as a starting 
point need not be a nationalist act.  That choice is more likely to be 
driven by the availability of detailed data, and the willingness of some 
individual(s) to do the work of adapting that system to serve wiki purposes.

The muddling and the re-invention of the wheel implicit in most people's 
approach to categorization was completely foreseeable.  I say this 
without finding fault.  It was just one of those miseries that had to be 
gone through; system convergence comes later.  Let's just keep away from 
automated systems until we know what we want.  A premature application of 
automation will only support the muddling.

>And: In classification there is no NPOV because there is no "right" way 
>to classify the world but it depends on the special needs and questions 
>I want to answer with a special system of subject indexing.
>
I agree with what you seem to be saying but I would not put it in terms 
of NPOV.  "Wiki is not paper," is a far more useful principle.  A 
traditional librarian may want to classify a single copy of a book on 
German libraries, and must decide whether that book should be shelved 
with books about Germany or books about libraries.  We do not have that 
restriction.

>Given the reasons I strongly recommend to stop using the categories and
>to focus on writing and improving good articles. Many categories can easily
>be replaced with normal links between articles. Adding and removing 
>categories do not change an article's content a bit. If you want to 
>keep track of all articles in some area use (Wiki)Projects, article 
>series, portals and learn how to use the "what links here"-function! 
>A good article is an article that can be found easily without categories.
>
I don't arrive at the same conclusion about stopping the use of 
categories.  The techniques that you mention are all good and effective, 
and they should obviously continue to be used.  Categories are a way of 
providing a comprehensive overview.  It is easy to see at an appropriate 
place on an article just how that article has been categorized, or 
indeed '''if''' it has been categorized.  A list system has its uses, 
but needs to be manually maintained.  It is not evident on the face of 
the article that it has been properly listed.  A non-contributing reader 
may not be aware of the list's existence, or of the purpose of "What 
links here."  Even a contributor is not going to be inclined to check 
every article to see if it has been properly listed, but without 
checking he has no way of knowing.  This brings me back to my earlier 
point about categorization of articles being a bottom-up procedure.

>Indeed classifying wikipedia articles is very interesting and will 
>become more important, but this should be an independent project - maybe 
>in a "Classifipedia" or "Categorypedia" that links to wikipedia articles.
>
No.  The categories are meaningless in isolation.

>You know - librarians normally do not write the books they organize and 
>search engine experts do not write the websites they crawl, so let's focus
>on what we can do the best: creating the most detailed, most understandable
>and freest encyclopedia in the history of mankind!
>  
>
Your premise here is the most important thing that you say.  No 
professional librarian would tolerate an author who goes around 
insisting that his books be classified in a particular way.  The 
authors, the editors and the classifiers all have their own roles on the 
wiki.  All are working toward the goal that you specify, but not in 
complete isolation.

Ec



