[Wikipedia-l] Categories: An implementation

Daniel Mayer maveric149 at yahoo.com
Sat Jun 14 13:48:32 UTC 2003


>Suggestion: We could easily import the existing
>lists into my implementation.
>
>Magnus

Only the major ones please. Many of the lists in WikiEn are really obscure and 
special purpose. 

IMO there should be very few categories (at least at first): one for each 
major category that is already on, for example, the en.wiki's main page 
(science, mathematics, physics, linguistics, etc. - the whole lot) along with 
type categories like "biography", "city/town/village", "country/nation", 
"subnational entity", "stub" etc. 

Keep it simple please (at least at first). If /that/ works then we can slowly 
expand the system in a measured and thoughtful way - voting for new 
categories may be a good solution here in order to prevent a rapid expansion 
of categories that would render the whole system useless. 

I'm very wary of allowing just anybody to create any category since from a 
database management perspective it would be easy for people to create many 
different variants for the same intended category ([[category:biology]], 
[[category:life sciences]], [[category:life science]], [[category:living 
things]], [[category:the study of living things]] etc). And having too many 
categories will be very difficult for people to remember and very unwieldy 
for people to choose from in a category search. Let the lists stay for the 
obscure stuff.
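
To illustrate the variant problem, here is a minimal sketch of how a short approved list plus an alias table could fold free-form labels into one category. The names APPROVED, ALIASES and canonical_category are hypothetical and are not part of Magnus's implementation; the alias entries are just the examples given above.

```python
# Hypothetical approved list; illustrative only, not taken from any real schema.
APPROVED = {"biology", "biography", "politics", "religion"}

# Hypothetical alias table mapping free-form variants to one approved category.
ALIASES = {
    "life sciences": "biology",
    "life science": "biology",
    "living things": "biology",
    "the study of living things": "biology",
}


def canonical_category(label):
    """Return the approved category for a label, or None if it is not recognized."""
    # Normalize case and collapse runs of whitespace before looking anything up.
    key = " ".join(label.lower().split())
    # Fold known variants into their approved name; unknown keys pass through.
    key = ALIASES.get(key, key)
    return key if key in APPROVED else None
```

Under this sketch a label that is neither approved nor a known alias returns None, so it could be rejected or queued for a vote rather than silently creating a new category.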

We also need to devise guidelines for when to assign labels. Again from a 
database design perspective: If this is not done consistently then the output 
of any sort will be suspect and perhaps complete garbage. As KQ pointed out 
we could assign the category 'crime' to [[Jesus Christ]] along with the 
category of 'biography' so that JC shows up in a list of criminals. But is JC 
most famous for being a crime figure? No! He is famous for being a religious 
figure so the category 'religion' would be there but not 'crime.' Same for 
[[Bill Clinton]] ([[category:biography]] and [[category:politics]] apply 
there).

With categories in place then one day we could have our special pages use 
these tags for things like Recent Changes. RC in the English Wikipedia often 
has 10 or more edits a minute now! It would be nice to have the option to 
only see articles that may interest the user (well, for me that would be 
everything, but one of my biology professors may want a Recent Changes that 
only displays articles relating to mathematics, the sciences and economics).
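
A rough sketch of what such a filtered Recent Changes could look like, assuming each edit record carries the categories of its article. The record layout and the function name filter_recent_changes are made up for illustration and do not reflect the actual software.

```python
def filter_recent_changes(changes, wanted):
    """Keep only edits whose article carries at least one wanted category."""
    wanted = set(wanted)
    # A non-empty intersection means the article has a category the user asked for.
    return [c for c in changes if wanted & set(c["categories"])]


# Hypothetical sample data; titles and category tags are illustrative.
changes = [
    {"title": "Prime number", "categories": ["mathematics"]},
    {"title": "Bill Clinton", "categories": ["biography", "politics"]},
    {"title": "Supply and demand", "categories": ["economics"]},
]

selected = filter_recent_changes(changes, ["mathematics", "economics"])
titles = [c["title"] for c in selected]
```

Here the edit order is preserved and an article tagged with several categories is shown once, as long as any one of them matches.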

This will become absolutely necessary if we continue our quasi-exponential 
growth pattern for WikiEn; one day in not too many years (fewer than most 
people probably think) there will be an average of hundreds of edits a 
minute! We need to sort that out somehow or no human will bother looking at 
RC anymore and will only revert vandalism to articles on their watchlists 
(even with sorting, bots would probably have to help with automatic 'probable 
vandalism/copyvio detection' - outputting results on a separate RC for that 
type of stuff - maybe even by assigning those categories to articles...). 

Alas it looks like Wikipedia is growing up. This time next year WikiEn 
Wikipedians may start to specialize their edits based on which categories 
they choose RC to output and several other language Wikis may follow the year 
after. Our small town is beginning to show signs of emerging cityhood (or has 
WikiEn at least already become a small city?). 

We are going where no wiki has gone before. There is both danger and 
opportunity in that. 

-- Daniel Mayer (aka mav)


