Suggestion: We could easily import the existing lists into my implementation.
Magnus
Only the major ones please. Many of the lists in WikiEn are really obscure and special purpose.
IMO there should be very few categories (at least at first): one for each major category that is already on, for example, en.wiki's main page (science, mathematics, physics, linguistics, etc. - the whole lot), along with type categories like "biography", "city/town/village", "country/nation", "subnational entity", "stub", etc.
Keep it simple please (at least at first). If /that/ works then we can slowly expand the system in a measured and thoughtful way - voting for new categories may be a good solution here in order to prevent a rapid expansion of categories that would render the whole system useless.
I'm very wary of allowing just anybody to create any category since from a database management perspective it would be easy for people to create many different variants for the same intended category ([[category:biology]], [[category:life sciences]], [[category:life science]], [[category:living things]], [[category:the study of living things]] etc). And having too many categories will be very difficult for people to remember and very unwieldy for people to choose from in a category search. Let the lists stay for the obscure stuff.
We also need to devise guidelines for when to assign labels. Again from a database design perspective: if this is not done consistently then output of any sort will be suspect and perhaps complete garbage. As KQ pointed out, we could assign the category 'crime' to [[Jesus Christ]] along with the category of 'biography' so that JC shows up in a list of criminals. But is JC most famous for being a crime figure? No! He is famous for being a religious figure, so the category 'religion' would be there but not 'crime.' Same for [[Bill Clinton]] ([[category:biography]] and [[category:politics]] apply there).
With categories in place then one day we could have our special pages use these tags for things like Recent Changes. RC in the English Wikipedia often has 10 or more edits a minute now! It would be nice to have the option to only see articles that may interest the user (well for me that would be everything but one of my biology professors may want a recent changes that only displays articles relating to mathematics, the sciences and economics).
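A minimal sketch of what a category-filtered Recent Changes could look like, assuming each edit carries the set of category tags of the article it touched. The `Edit` type and `filter_recent_changes` function are hypothetical names made up for illustration, not part of any existing Wikipedia code:

```python
from typing import NamedTuple


class Edit(NamedTuple):
    """One Recent Changes entry (hypothetical structure)."""
    title: str
    categories: frozenset  # category tags of the edited article


def filter_recent_changes(edits, wanted):
    """Keep only edits to articles tagged with at least one wanted category."""
    wanted = set(wanted)
    return [edit for edit in edits if edit.categories & wanted]
```

A reader interested only in mathematics and the sciences would pass `{"mathematics", "science"}` as `wanted`; an empty selection returns nothing, so "show everything" would have to be a separate default.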
This will become absolutely necessary if we continue our quasi-exponential growth pattern for WikiEn; one day in not too many years (fewer than most people probably think) there will be an average of hundreds of edits a minute! We need to sort that out somehow or no human will bother looking at RC anymore and will only revert vandalism to articles on their watchlists (even with sorting, bots would probably have to help with automatic 'probable vandalism/copyvio detection' - outputting results on a separate RC for that type of stuff - maybe even by assigning those categories to articles...).
Alas it looks like Wikipedia is growing up. This time next year WikiEn Wikipedians may start to specialize their edits based on which categories they choose RC to output and several other language Wikis may follow the year after. Our small town is beginning to show signs of emerging cityhood (or has WikiEn at least already become a small city?).
We are going where no wiki has gone before. There is both danger and opportunity in that.
-- Daniel Mayer (aka mav)
Daniel Mayer wrote:
I'm very wary of allowing just anybody to create any category since from a database management perspective it would be easy for people to create many different variants for the same intended category ([[category:biology]], [[category:life sciences]], [[category:life science]], [[category:living things]], [[category:the study of living things]] etc). And having too many categories will be very difficult for people to remember and very unwieldy for people to choose from in a category search. Let the lists stay for the obscure stuff.
Wow! Just like article titles. Maybe we should let only sysops name articles, then everyone else can fill in the information.
But, they might repeat information or organize it badly. Better to let sysops define the structure of the article, then everyone else can fill in the bodies of the paragraphs.
Having category aliases (just like redirects) and letting categories include other categories would make this "problem" moot.
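As a rough illustration of that idea, here is a sketch in which aliases resolve to one canonical category (much as a redirect resolves to an article) and a category can be included in other categories. The `ALIASES` and `PARENTS` tables and both function names are assumptions invented for this example; nothing here reflects an actual implementation:

```python
# Hypothetical alias table: variant name -> canonical category name.
ALIASES = {
    "life sciences": "biology",
    "life science": "biology",
    "living things": "biology",
    "the study of living things": "biology",
}

# Hypothetical inclusion table: category -> categories that include it.
PARENTS = {
    "biology": {"science"},
    "botany": {"biology"},
}


def canonical(name):
    """Normalise case and whitespace, then follow an alias if one exists."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, key)


def expand(name):
    """Return the canonical category plus every category that includes it."""
    seen = set()
    stack = [canonical(name)]
    while stack:
        cat = stack.pop()
        if cat in seen:
            continue  # already visited; also guards against inclusion cycles
        seen.add(cat)
        stack.extend(canonical(parent) for parent in PARENTS.get(cat, ()))
    return seen
```

With tables like these, an article tagged [[category:life sciences]] ends up in the same place as one tagged [[category:biology]], and an obscure sub-category such as botany still shows up under the broader categories that include it.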
I am very strongly against a category scheme that is limited or sysops-only. I *want* my obscure sub-categories.
-- brion vibber (brion @ pobox.com)
Daniel Mayer wrote:
Suggestion: We could easily import the existing lists into my implementation.
Magnus
Only the major ones please. Many of the lists in WikiEn are really obscure and special purpose.
Agreed. The obscure ones could be fit in at a later stage when we have a better grasp of how things are going.
IMO there should be very few categories (at least at first): one for each major category that is already on, for example, en.wiki's main page (science, mathematics, physics, linguistics, etc. - the whole lot), along with type categories like "biography", "city/town/village", "country/nation", "subnational entity", "stub", etc.
That's consistent with my approach, but not necessarily with the categories mentioned. Top level categories need to be comprehensive to ensure that '''every''' article can be placed in at least one of them. If these top level categories are given single letter codes, that means a maximum of 26. Some of the more obvious 2nd level codes could be implemented fairly early on.
Keep it simple please (at least at first). If /that/ works then we can slowly expand the system in a measured and thoughtful way - voting for new categories may be a good solution here in order to prevent a rapid expansion of categories that would render the whole system useless.
Good! I have a lot of these to propose, but always following the principles of comprehensiveness and inclusion. This would give a place for all subcategories, and would ensure that even articles which do not yet have a subcategory can be found in the broader category. I would likely propose categories in related batches.
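One way the single-letter scheme could work, purely as a sketch: assume a 2nd level code is formed by appending a letter to its top level code, so that membership in the broader category becomes a prefix test. The code assignments ("S" for science, "SB" for biology, and so on) and the function name are hypothetical; the messages above do not specify them:

```python
import string

# Single-letter top level codes allow at most 26 categories.
MAX_TOP_LEVEL = len(string.ascii_uppercase)

# Hypothetical code assignments, for illustration only.
CODES = {
    "S": "science",
    "SB": "biology",       # 2nd level: top level letter plus one more letter
    "SM": "mathematics",
    "B": "biography",
}


def articles_in(category_code, articles):
    """List titles whose code falls under category_code (prefix match)."""
    return sorted(
        title for title, code in articles.items()
        if code.startswith(category_code)
    )
```

Under this assumption an article coded only "S" (no subcategory yet) and one coded "SB" are both returned for "S", which is the inclusion property described above.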
I'm very wary of allowing just anybody to create any category since from a database management perspective it would be easy for people to create many different variants for the same intended category ([[category:biology]], [[category:life sciences]], [[category:life science]], [[category:living things]], [[category:the study of living things]] etc). And having too many categories will be very difficult for people to remember and very unwieldy for people to choose from in a category search. Let the lists stay for the obscure stuff.
These are real risks. There may be a way of dealing with this within the distinction of integrated and non-integrated categories. The latter would be far more flexible in what they would allow, but they could be more easily culled as their uselessness and duplication became apparent.
We also need to devise guidelines for when to assign labels. Again from a database design perspective: if this is not done consistently then output of any sort will be suspect and perhaps complete garbage. As KQ pointed out, we could assign the category 'crime' to [[Jesus Christ]] along with the category of 'biography' so that JC shows up in a list of criminals. But is JC most famous for being a crime figure? No! He is famous for being a religious figure, so the category 'religion' would be there but not 'crime.' Same for [[Bill Clinton]] ([[category:biography]] and [[category:politics]] apply there).
I don't think we'll ever be able to prevent people from assigning goofy categories to things, any more than we can completely prevent goofy edits and vandalism. Doing this would require becoming a closed project, and that would defeat Wikipedia's prime directive.
With categories in place then one day we could have our special pages use these tags for things like Recent Changes. RC in the English Wikipedia often has 10 or more edits a minute now! It would be nice to have the option to only see articles that may interest the user (well for me that would be everything but one of my biology professors may want a recent changes that only displays articles relating to mathematics, the sciences and economics).
Certainly, but this addresses how it might be used rather than how it might be set up.
This will become absolutely necessary if we continue our quasi-exponential growth pattern for WikiEn; one day in not too many years (fewer than most people probably think) there will be an average of hundreds of edits a minute! We need to sort that out somehow or no human will bother looking at RC anymore and will only revert vandalism to articles on their watchlists (even with sorting, bots would probably have to help with automatic 'probable vandalism/copyvio detection' - outputting results on a separate RC for that type of stuff - maybe even by assigning those categories to articles...).
Only 867,440 articles to Mega-Wiki! :-) Watching everything is already impossible for a single person.
Alas it looks like Wikipedia is growing up. This time next year WikiEn Wikipedians may start to specialize their edits based on which categories they choose RC to output and several other language Wikis may follow the year after. Our small town is beginning to show signs of emerging cityhood (or has WikiEn at least already become a small city?).
We are going where no wiki has gone before. There is both danger and opportunity in that.
Exactly
wikipedia-l@lists.wikimedia.org