I wanted to recall a neat point of history. Categories have been discussed for a long, long time. I think I remember that at some point it was suggested that they could only be created by sysops, and only after much consultation, so as to avoid the anarchic creation of hundreds of categories.
It is funny to remember that :-) I love a living Wikipedia, with its anarchic growth :-)))
Maybe I have been out of the loop but just what is the purpose of Categories?
Just now I noticed that someone went through a bunch of pages I was watching and added the category "Opera Composer", which seems entirely logical for the relevant articles, except that we already have a page [[List of opera composers]]. Does this mean that this article is redundant? My personal instinct, which may or may not be relevant here, is not to do things in two different ways when one suffices. Will people continue to update the handmade lists now that Categories pages are available in such cases of redundancy as this? If not, they should be deleted, as outdated lists are a help to nobody.
V.
Viajero wrote:
Maybe I have been out of the loop but just what is the purpose of Categories?
Just now I noticed that someone went through a bunch of pages I was watching and added the category "Opera Composer", which seems entirely logical for the relevant articles, except that we already have a page [[List of opera composers]]. Does this mean that this article is redundant? My personal instinct, which may or may not be relevant here, is not to do things in two different ways when one suffices. Will people continue to update the handmade lists now that Categories pages are available in such cases of redundancy as this? If not, they should be deleted, as outdated lists are a help to nobody.
Yes and no. To some extent the manual lists are wish lists, with empty links appearing in red. The category system can't pick up list elements that don't exist. Any manual list should probably receive a category tag as well. In a closed-ended subject, or one that is effectively closed-ended (such as the list of US presidents), there are strong arguments for deleting the list once pages exist for all its elements. The list of opera composers, however, is open-ended and should probably remain, so that new names can be added and we can know which have been done and which remain to be done.
Ec
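The distinction drawn above, that a handmade list can show entries with no article yet while a category only ever shows tagged articles, can be sketched in a few lines of Python. This is only an illustration under assumed data: the names `handmade_list`, `existing_articles` and `tagged`, and the placeholder titles in them, are hypothetical and not taken from the Wikipedia database.

```python
# Hypothetical data: titles on a handmade list, titles that exist as articles,
# and articles that carry an (assumed) opera-composer category tag.
handmade_list = {"Giuseppe Verdi", "Richard Wagner", "Composer Without Article"}
existing_articles = {"Giuseppe Verdi", "Richard Wagner", "George Ritzer"}
tagged = {"Giuseppe Verdi"}


def red_links(list_entries, articles):
    """List entries with no article yet; a category can never show these."""
    return sorted(list_entries - articles)


def untagged(list_entries, articles, category_members):
    """Existing articles on the list that still lack the category tag."""
    return sorted((list_entries & articles) - category_members)


print(red_links(handmade_list, existing_articles))
print(untagged(handmade_list, existing_articles, tagged))
```

The first function is the "wish list" role of the manual list; the second shows how the list could be used to find articles that the category has missed.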
I've always been a big fan of lists for many reasons, and I thought about this too.
I agree with Ray that one big advantage of lists is that they show red links.
Another advantage is that lists can be formatted and augmented in customized ways, ones that would take a large amount of customized programming to produce. For example, the list of opera composers can not only include uncreated articles, but it can also be hand-tailored to include information like dates of birth and nationality, or perhaps be arranged in chronological order, or by nationality, etc. To me this kind of information architecture is a large part of the beauty of Wikipedia.
Many list articles are like a human-made SQL query from the database of Wikipedia that would have required a huge overhead in database fields to create automatically.
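To make the "human-made query" comparison concrete, here is a minimal sketch in Python. It assumes each list entry carries hand-entered fields; the `Composer` record and the two helper functions are hypothetical names introduced only for illustration, and nothing like them existed in the category system under discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Composer:
    name: str
    born: int
    nationality: str


# Hand-entered rows, as an editor would maintain them on a list page.
composers = [
    Composer("Richard Wagner", 1813, "German"),
    Composer("Claudio Monteverdi", 1567, "Italian"),
    Composer("Giuseppe Verdi", 1813, "Italian"),
]


def chronological(rows):
    """Order by birth year, then name, like a chronological list page."""
    return sorted(rows, key=lambda c: (c.born, c.name))


def by_nationality(rows):
    """Group names under each nationality, alphabetically within a group."""
    groups = {}
    for c in sorted(rows, key=lambda c: c.name):
        groups.setdefault(c.nationality, []).append(c.name)
    return groups


print([c.name for c in chronological(composers)])
print(by_nationality(composers))
```

A category tag alone carries none of these fields, which is why producing such views automatically would need the extra database overhead mentioned above.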
Certainly there may be lists that are rendered obsolete by categories, but in most cases I see categories as complementing lists, not replacing them.
But this will all shake out. The debate about how lists will work is really over. Whatever was said is water under the bridge. Now it is the collective contributions of editors actually creating categories that will determine how they are really used.
As far as lists being "out of date" goes: well, that's *always* been an issue with lists and will remain so. I don't see that as a problem, really. We all do our best to keep things synchronized and up to date. Certainly categories will suffer from the same thing too, since many new articles will be created without appropriate category tags.
WikiEN-l mailing list WikiEN-l@Wikipedia.org http://mail.wikipedia.org/mailman/listinfo/wikien-l
At 02:30 PM 5/31/2004 -0400, Matthew Trump wrote:
As far as lists being "out of date" goes: well, that's *always* been an issue with lists and will remain so. I don't see that as a problem, really. We all do our best to keep things synchronized and up to date. Certainly categories will suffer from the same thing too, since many new articles will be created without appropriate category tags.
I think that categories will likely be quite helpful for keeping lists up to date, actually. A little while back, for example, I was doing some work on trying to get [[Links to disambiguating pages]] cleaned up; adding disambiguation pages that had been missed, culling out pages that had once upon a time been disambiguating pages but weren't any longer, etc. It was a lot of hard labor and required some custom database searches, and was only possible because disambiguation pages already had a pseudo-category tag in the form of {{msg:disambig}}. Most other lists wouldn't have something like that, so a "what links here" query would have been full of noise.
Since category listings are generated "bottom up", on the other hand, they immediately respond to alterations made out "in the field". If a page gets moved or disambiguated, or its contents change drastically enough to be recategorized, it's immediately updated in the category list. Should be quite handy, IMO.
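A rough sketch of what "bottom up" means here, in Python: the category listing is derived from tags inside the articles themselves, so editing one article changes the listing with no list page to maintain. The `build_index` helper and the sample page texts are hypothetical; the regular expression only handles a simplified form of the `[[Category:...]]` tag and is not the software's actual parser.

```python
import re
from collections import defaultdict

# Matches tags such as [[Category:Opera composers]] or [[Category:Writers|Ritzer]].
CATEGORY_TAG = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")


def build_index(pages):
    """Derive category -> sorted article titles from the articles' own text."""
    index = defaultdict(list)
    for title, wikitext in pages.items():
        for name in CATEGORY_TAG.findall(wikitext):
            index[name.strip()].append(title)
    return {cat: sorted(titles) for cat, titles in index.items()}


pages = {
    "Giuseppe Verdi": "Italian composer. [[Category:Opera composers]]",
    "George Ritzer": "Sociologist. [[Category:Writers]]",
}
before = build_index(pages)

# Editing one article is enough; no separate list page has to be touched.
pages["George Ritzer"] = "Sociologist. [[Category:Sociologists]]"
after = build_index(pages)

print(before)
print(after)
```

The same property cuts the other way: because the listing lives in the articles, there is no single place to edit when a whole category needs to change.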
Good question. I would NOT delete any of the handmade lists, for one simple reason: they have grown slowly and have had many intelligent contributors, whereas now people who are not necessarily experts in all the various fields (how could they be?) are in a hurry (you tell me why) to compile lists of all sorts.
A few hours ago I mildly protested against one of those categorizations. See [[George Ritzer]], pigeonholed as a "writer".
KF
----- Original Message ----- From: "Viajero" viajero@quilombo.nl To: "English Wikipedia" wikien-l@Wikipedia.org Sent: Monday, May 31, 2004 11:40 AM Subject: [WikiEN-l] what's the deal with categories?
K Forstner wrote:
A few hours ago I mildly protested against one of those categorizations. See [[George Ritzer]], pigeonholed as a "writer".
Hang on, he hasn't been "pigeonholed". He has been noted as being a writer, which seems fair enough - his article says "He has published extensively", and more than half of it is taken up with a list of books he has written. Just because he is in category "Writers" doesn't mean he can't also be in categories "Sociologists" and "Professors"!
Cheers! David...
Timwi wrote:
K Forstner wrote:
A few hours ago I mildly protested against one of those categorizations. See [[George Ritzer]], pigeonholed as a "writer".
Interestingly, though, you haven't changed it either.
It's clear from the article that he has been a writer. If you want to add further categories, do go ahead.
Ec
----- Original Message ----- From: "Ray Saintonge" saintonge@telus.net To: "English Wikipedia" wikien-l@Wikipedia.org Sent: Tuesday, June 01, 2004 4:46 PM Subject: Re: [WikiEN-l] Re: what's the deal with categories?
Yesterday I wrote:
A few hours ago I mildly protested against one of those categorizations. See [[George Ritzer]], pigeonholed as a "writer".
Interestingly, though, you haven't changed it either.
It's clear from the article that he has been a writer. If you want to add further categories, do go ahead.
Ec
And now I'd like to add:
This is surprisingly inconsistent. What exactly in the [[George Ritzer]] article makes it clear that he has been a "writer"? As I tried to point out on the talk page, well of course he has "written" something, but nothing even remotely comparable to what the other categorized "writers" have written, namely fiction.
Also, why should I change anything? I may be wrong. What's more, I'm not at all familiar with those new categorizations, I don't think they have been properly discussed, so why should I go ahead and contribute to something I'm not convinced of?
KF
K Forstner wrote:
This is surprisingly inconsistent. What exactly in the [[George Ritzer]] article makes it clear that he has been a "writer"? As I tried to point out on the talk page, well of course he has "written" something, but nothing even remotely comparable to what the other categorized "writers" have written, namely fiction.
Also, why should I change anything? I may be wrong. What's more, I'm not at all familiar with those new categorizations, I don't think they have been properly discussed, so why should I go ahead and contribute to something I'm not convinced of?
I have no problem agreeing that to say that Ritzer is a writer is a remarkably shallow observation. But for this epistolary fugue I might never have heard of Ritzer, so I am ill-equipped to participate in his categorization.
The year-long discussion that preceded was focused on whether there should be categories at all, and how such an idea might be technically implemented. The discussion of specific categories was negligible. Details of the scheme(s) have yet to mature.
Ec
K Forstner wrote:
Also, why should I change anything? I may be wrong.
Loads of people change loads of things all the time, and many of them are wrong. Someone else fixes the wrongs over time. That's how Wikipedia works.
What's more, I'm not at all familiar with those new categorizations, I don't think they have been properly discussed, so why should I go ahead and contribute to something I'm not convinced of?
Because that's the best way to convince yourself of something (whether it be "categories are great" or "categories are crap").
Timwi
----- Original Message ----- From: "Timwi" timwi@gmx.net To: wikien-l@wikipedia.org Sent: Wednesday, June 02, 2004 1:46 AM Subject: [WikiEN-l] Re: What's the deal with categories?
K Forstner wrote:
Also, why should I change anything? I may be wrong.
Timwi's answer: Loads of people change loads of things all the time, and many of them are wrong. Someone else fixes the wrongs over time. That's how Wikipedia works.
And let me add now once more: No, that's NOT how Wikipedia works for me. Anyone may be unintentionally wrong, but if I'm not sure to begin with, this is certainly NOT a good prerequisite for changing an article, even if it's a minor thing. It's just a lot more work for the rest of us correcting the mistake(s) again.
What's more, I'm not at all familiar with those new categorizations, I don't think they have been properly discussed, so why should I go ahead and contribute to something I'm not convinced of?
Timwi's answer: Because that's the best way to convince yourself of something (whether it be "categories are great" or "categories are crap").
And again I can't say I understand this. You know, I've got used to waiting and seeing what will happen. It seems anyone (a single contributor) can create a new category, merge existing ones, rename them, whatever. Soon the same questions, those we had when lots of lists were being created, will crop up: Where do we draw the line? Do we need a category "People whose left index finger is crippled"? etc. etc.
By the way, I'm not a cunctator. I'm certainly NOT reluctant to contribute something to Wikipedia, including making major changes. But as I said, I have to be convinced BEFOREHAND that a particular change is a good thing.
All the best,
KF
K Forstner wrote:
It seems anyone (a single contributor) can create a new category, merge existing ones, rename them, whatever.
Merging and renaming large categories involves a large number of edits (one per article in the category or categories). Few intelligent people will go to that trouble if they think it will be reverted anyway.
In contrast, you have complained about the categorisation of only one article. Changing that category tag back and forth is completely trivial.
Timwi
Thanks to various responses to my earlier post on the subject, I have come to appreciate some of the possibilities of Categories, at least in theory. So, this morning I decided to categorize the forty or so articles on writers which I created or to which I made major contributions, having seen several suitable Categories show up in my Watchlist for similar articles. To start with, I wanted to know what categories have already been created for writers and journalists. For example, I saw a category for "Argentine writers" (on the article on Borges, I think). Does this mean I should also create categories for Uruguayan, Mexican, and Chilean writers (I wrote up a couple) even though such categories would have one or two entries? However, there doesn't seem to be any comprehensive, hierarchical index of categories. [[Wikipedia:Categorization]] should serve this function, but it is woefully incomplete. [[Category:Main page]] is even worse. So, I turned to
http://en.wikipedia.org/w/wiki.phtml?title=Special:Categories&article=Li...
which is an alphabetical list of categories, but the first five hundred only range from ".hack" to "British cheeses", suggesting that on the order of eight to ten thousand categories (!!!) have already been created. Obviously this huge page can't be used for looking up existing categories, if only because it would place a tremendous burden on the servers. A random selection from just the first 500:
* 1983 albums
* 24-hour television news channels
* AHL Trophies and Awards
* Aerosmith albums
* Airports of the United Arab Emirates
* Art galleries and museums in Ohio
* Belgian cuisine
* Boston Bruins players
What can possibly be the use of such narrow categories with only a handful of entries? Shouldn't we be aiming for broad categories (i.e., albums, TV stations, awards, airports, museums, cuisines, athletes)?
In any case, my only option for writers appears to be to add categories on an ad hoc basis, and that exactly is what seems to be happening across the entire encyclopedia. Turning to opera, for example, a topic which I have worked a lot on, I see on [[Category:Opera_singers]] that the following categories have been created:
Category:Baritone opera singers, Category:Bass opera singers, Category:Contraltos (opera singers), Category:Mezzo-sopranos (opera singers), Category:Sopranos (opera singers), Category:Tenors (opera singers)
They aren't even consistently labelled! It is entirely possible, perhaps probable, that someone will come along and create "Category:Contraltos opera singers" or "Category:Tenors". Each has only one entry (tenor has two), indicating that they were created on a strictly ad hoc basis with no effort made to track down all the articles for a given category.
I realize Categories are very new and the system will evolve. I am also a big fan of Wikipedia's self-organizing characteristics and I don't want to sound like a control freak. However, it seems to me that Categories will only be useful if the system is implemented in a thoughtful way. At the very least, it should be compulsory to add new categories to [[Wikipedia:Categorization]] or [[Category:Main page]], so that they can be referenced and evaluated by other editors. Better yet would be to have some kind of vetting procedure for introducing new categories. But given the number of categories now in use, it seems that the genie is out of the bottle and that this is all just wishful thinking.
V.
What about selecting a topic you know well (perhaps Opera? or a larger Music group), gathering around you a bunch of editors you know to be reliable on that topic, and then forming a working group to identify which categories should exist, to what extent, and under which naming scheme? Then, when you all agree, write down, for example, all the categories likely to cover the Opera topic, make a list of which articles should belong in which category, have a bot apply the categories to those articles, and remove all categories that do not fit the scheme.
Yes, the bot idea or some variation sounds extremely attractive.
During this shakedown cruise of categories, the main thing I've noticed is that they are changing very rapidly. A category is created, and articles are added to it. Then someone else creates more specific sub-categories and begins shifting articles from the higher-level to the lower-level category. Or the category name is changed to reflect standards of capitalization or formatting (as might happen in the opera singer example).
In any case, this is resulting in many editors going through many articles repeatedly to change the categorization. My watchlist is filled with long repeated edits in multiple waves.
The root of this is that unlike lists, which can be changed "all at once" by moving a page, categories have to be tweaked by hand in every single article in a particular category. It is the flip side of the auto-generated nature of categories.
I agree with Anthere that some kind of bot system or automated update of categories would be a very good thing if it could alleviate this problem. The current system is not only time-consuming but will inevitably generate anger among editors as they see their hard work of editing many articles swept away. Unlike page moves, this cannot be undone by a simple procedure but requires many edits and hours of work to undo under the current system.
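The "one edit per article" cost, and what a bot would be automating, can be sketched as follows in Python. This is a hypothetical illustration working on in-memory page text only: `rename_category` is an invented helper, the article titles are placeholders, and the new name "Tenor opera singers" is an assumed target, not an agreed naming scheme. No real bot framework or wiki API is used.

```python
import re


def rename_category(pages, old, new):
    """Return (updated pages, titles that needed an edit) for a category rename."""
    pattern = re.compile(r"\[\[Category:" + re.escape(old) + r"(\|[^\]]*)?\]\]")
    updated, edited = {}, []
    for title, wikitext in pages.items():
        # Keep any sort key ("|...") that followed the old category name.
        replaced = pattern.sub(
            lambda m: "[[Category:" + new + (m.group(1) or "") + "]]", wikitext
        )
        if replaced != wikitext:
            edited.append(title)
        updated[title] = replaced
    return updated, sorted(edited)


pages = {
    "Singer A": "Bio. [[Category:Tenors (opera singers)]]",
    "Singer B": "Bio. [[Category:Tenors (opera singers)|B, Singer]]",
    "Singer C": "Bio. [[Category:Sopranos (opera singers)]]",
}
updated, edited = rename_category(pages, "Tenors (opera singers)", "Tenor opera singers")
print(edited)
```

The length of `edited` is the number of separate article edits the rename costs, which is also the number of edits needed to undo it.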
I'm again intrigued by the discussions here, be it categorizations or a user I seem to have never come across called 172 (?). I came back from a weekend a couple of hours ago and since then have been trying to access Wikipedia -- to no avail.
As far as categories are concerned, I tried to point out some days ago that there should be some policy everyone could agree on, but Timwi suggested we should just go ahead. The result of this approach is being deplored already.
I wonder though if a bot is the right answer.
KF
----- Original Message ----- From: "Matthew Trump" wikipedia@decumanus.com To: anthere9@yahoo.com; "English Wikipedia" wikien-l@Wikipedia.org Sent: Sunday, June 06, 2004 6:54 PM Subject: Re: [WikiEN-l] Re: A first encounter with Categories
K Forstner wrote:
I'm again intrigued by the discussions here, be it categorizations or a user I seem to have never come across called 172 (?). I came back from a weekend a couple of hours ago and since then have been trying to access Wikipedia -- to no avail.
As far as categories are concerned, I tried to point out some days ago that there should be some policy everyone could agree on, but Timwi suggested we should just go ahead. The result of this approach is being deplored already.
I wonder though if a bot is the right answer.
KF
Well, I regret that all that people took from my proposal was the final comment on bots. It was not a bot proposal I was suggesting, but a teamwork proposal, something a bit like a wikiproject, in order to reduce the current anarchy in category organisation.
No one commented on anything but this bot issue, so I gather my suggestion was not interesting. Never mind :-)
That part was too obviously a good idea I think. :)
As far as the bot thing goes, I meant it in the context of what you proposed: something that can be invoked when there is agreement that categories should be altered in some fashion. As for my comment before, I think it's very much a good idea, and nearly self-evident, that categories should ultimately be a group effort in some semi-organized fashion, as you described.
But I suppose right now really the only thing to do is sit back and let the experiment with categories play out, with all its messiness. Certainly everything is getting screwed up in a big way with categories very quickly, but I suppose this is the only way to see what should be done.
On 06/07/04 at 12:57 AM, Anthere anthere9@yahoo.com said:
Well, I regret that all what people understood of my proposal was the final comment on bots. It was not a bot proposal I was suggesting, but a team working proposal, something a bit like a wikiproject, in order to reduce current anarchy on category organisation.
No one commented on anything, but on this bot issue; so I gather my suggestion was not interesting. Never mind :-)
Actually, it was a good suggestion. I left a message on Camembert's talk page yesterday asking him if he wanted to coordinate with classical music categories with me and a similar message on another user's page who is interested in Latin American topics. Alas, this was shortly before the database server went belly-up...
V.
Viajero wrote:
Actually, it was a good suggestion. I left a message on Camembert's talk page yesterday asking him if he wanted to coordinate with classical music categories with me and a similar message on another user's page who is interested in Latin American topics. Alas, this was shortly before the database server went belly-up...
V.
I am glad then :-) Perhaps you could at the same time define which areas are not yet covered and really should be ?
ant
Anthere,
I'm sorry for any/all misunderstandings. You know, I'm not against Rambots updating information on obscure American towns, but the bots whose aim it was to de-link disambiguation pages caused some silly lines on a number of pages, especially in a linguistic context (which is not that rare) when you want to point exactly to the fact that a word has several meanings.
Anyway, I'm clearly out of touch with the rest of the discussion. No one ever reacts to my contributions (you are the exception here), so I seem to have no idea what Wikipedia discussions are currently all about. I mean I can't even access it, but everyone seems to know everything about it except me.
I'll have to wait till Wikipedia is online again to see the status quo on categorizations.
All the best,
KF
----- Original Message ----- From: "Anthere" anthere9@yahoo.com To: wikien-l@wikipedia.org Sent: Monday, June 07, 2004 12:57 AM Subject: [WikiEN-l] Re: A first encounter with Categories
K Forstner wrote:
I'm again intrigued by the discussions here, be it categorizations or a user I seem to have never come across called 172 (?). I came back from a weekend a couple of hours ago and since then have been trying to access Wikipedia -- to no avail.
As far as categories are concerned, I tried to point out some days ago that there should be some policy everyone could agree on, but Timwi suggested we should just go ahead. The result of this approach is being deplored already.
I wonder though if a bot is the right answer.
KF
Well, I regret that all what people understood of my proposal was the final comment on bots. It was not a bot proposal I was suggesting, but a team working proposal, something a bit like a wikiproject, in order to reduce current anarchy on category organisation.
No one commented on anything, but on this bot issue; so I gather my suggestion was not interesting. Never mind :-)
Well, errr, don't worry. Things will sort out on time :-) You are correct that bots may cause problems...
I fear I disagree with you when you say "everyone knows everything about it but me". Categories seem a bit anarchic right now (well, the last time I managed to edit was 2 days ago...) and it seems some bits of functionality were forgotten in the software update. No one knows everything. Ever :-)
I'd like to support a comment made previously. It has to do with categories and meta tagging.
Proper meta tagging will allow our content to be neatly organised and easy to search. Proper meta tagging implies neat categories, which may be crossed by other neat categories. Internet is a huge mess of data, most of them are very poorly referenced. We'll become a serious reference if our content is cleanly tagged, hence searchable with complex queries.
[[list of museums in Australia with a chinese flag hanging above the entrance door]] is of no use.
Search "[[museum]]" where "[[country]] is 'australia'" and "[[topic]] is 'China'" is useful.
To be useful, meta-tags must be large rather than very specific. Query gives the specificity.
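A minimal sketch of the kind of query described above, written in Python. Everything in it is hypothetical: the article titles, the tag names (`kind`, `country`, `topic`) and the `search` helper are invented for illustration, and nothing like this exists in the software under discussion. It only shows how a few broad tags plus a query can give the specificity that a very narrow list would otherwise have to provide.

```python
# Hypothetical articles, each carrying a few broad tags (assumed names).
ARTICLES = {
    "Museum A": {"kind": "museum", "country": "australia", "topic": "china"},
    "Museum B": {"kind": "museum", "country": "australia", "topic": "maritime"},
    "Museum C": {"kind": "museum", "country": "france", "topic": "china"},
    "Hotel D": {"kind": "hotel", "country": "australia", "topic": "china"},
}


def search(articles, **wanted):
    """Return sorted titles whose tags match every requested key/value pair."""
    return sorted(
        title
        for title, tags in articles.items()
        if all(tags.get(key) == value for key, value in wanted.items())
    )


# The broad tags stay generic; the query supplies the specificity.
print(search(ARTICLES, kind="museum", country="australia", topic="china"))
```

Dropping a condition widens the result, so the same three tags answer "all museums" or "everything about China" without any new tag being created.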
At 03:22 AM 6/8/2004 +0200, Anthere wrote:
[[list of museums in Australia with a chinese flag hanging above the entrance door]] is of no use.
Search "[[museum]]" where "[[country]] is 'australia'" and "[[topic]] is 'China'" is useful.
To be useful, meta-tags must be large rather than very specific. Query gives the specificity.
But thanks to subcategorization, they can be _both_. [[Museums in ohio]] can be a subcategory of [[Museums]] along with all the other [[Museums in <location>]] subcategories, and then all you need to do is throw in a mechanism for referring to all the articles in a category tree and you can treat [[Museums]] as if it contained all museum articles everywhere. If you're already proposing a query language for manipulating categories this seems a fairly trivial extension to it to me.
I've been wavering a bit, but I think I'm coming down pretty solidly in the fine-grained categorization camp. The other day I was working on moving the articles in the [[North American Rivers]] category to the [[North American rivers]] category (note capitalization), and started putting the articles into [[Idaho rivers]], [[Alberta rivers]], etc subcategories instead because the list was so big and generic otherwise. I then found it trivially easy to stick those subcategories under the relevant geographic categories as well ([[Idaho]], [[Alberta]]), and it occurred to me that it would be fairly easy once I was done to set up [[Mississippi watershed]], [[Colorado watershed]], etc. since in many cases whole entire states drain into those and I could just drop their subcategories in place to include all of their rivers. It would be a whole lot more work if I had to go around to each of those articles and add a new category to every river in North America.
IMO, fine-grained categories provide a convenient "handle" by which groups of articles can be organized in various useful ways, without requiring editors to learn how to deal with query languages doing complicated or mysterious stuff "behind the scenes" or adding lots of high-level category tags to individual articles.
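The "mechanism for referring to all the articles in a category tree" could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the two dictionaries, the sample river and city entries and the `articles_in_tree` helper are hypothetical and do not describe how MediaWiki stores or queries categories.

```python
# Hypothetical data: direct subcategories and direct member articles.
SUBCATEGORIES = {
    "North American rivers": ["Idaho rivers", "Alberta rivers"],
    "Idaho": ["Idaho rivers"],
}
MEMBERS = {
    "Idaho rivers": ["Snake River", "Salmon River"],
    "Alberta rivers": ["Bow River"],
    "Idaho": ["Boise"],
}


def articles_in_tree(category, seen=None):
    """Collect the articles in a category and in all of its subcategories."""
    if seen is None:
        seen = set()
    if category in seen:  # guard against cycles in the category graph
        return set()
    seen.add(category)
    found = set(MEMBERS.get(category, []))
    for child in SUBCATEGORIES.get(category, []):
        found |= articles_in_tree(child, seen)
    return found


print(sorted(articles_in_tree("North American rivers")))
print(sorted(articles_in_tree("Idaho")))
```

Because "Idaho rivers" sits under both parents, its articles show up in either tree without any extra tag on the individual river articles.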
On 06/08/04 at 03:22 AM, Anthere anthere9@yahoo.com said:
Proper meta tagging will allow our content to be neatly organised and easy to search. Proper meta tagging implies neat categories, which may be crossed by other neat categories. Internet is a huge mess of data, most of them are very poorly referenced. We'll become a serious reference if our content is cleanly tagged, hence searchable with complex queries.
Very well stated.
[[list of museums in Australia with a chinese flag hanging above the entrance door]] is of no use.
Search "[[museum]]" where "[[country]] is 'australia'" and "[[topic]] is 'China'" is useful.
To be useful, meta-tags must be large rather than very specific. Query gives the specificity.
My intuition as well. Since we also have full-text searching, one can easily search for "museum" "Australia" "chinese flag" or any such narrow target.
V.
On Monday 07 June 2004 21:44, K Forstner wrote:
I'm sorry for any/all misunderstandings. You know, I'm not against Rambots updating information on obscure American towns, but the bots whose aim it was to de-link disambiguation pages caused some silly lines on a number of pages, especially in a linguistic context (which is not that rare) when you want to point exactly to the fact that a word has several meanings.
That also can happen with people, not only with bots. When it does, I reambiguate with comment <!-- don't disambiguate because ... -->
Matthew Trump wrote:
In any case, this is resulting in many editors going through many articles repeatedly to change the categorization. My watchlist is filled with long repeated edits in multiple waves.
I really don't see anything wrong with that. Some people just act a bit too quickly, and add articles to too broad categories. (I did that too at first.) But it gives people something to work with, and splitting them up into smaller categories is less work than creating smaller categories from scratch.
This is the Wiki process. Someone makes a sub-optimal contribution and others improve on it. Yes, sometimes it's a bit of work to clean up after *really* sub-optimal contributions, but a good end-result will eventually precipitate out. C'est la vie-kie.
Timwi
Yes, I'm aware that this is the wiki process :). My point is that it's a very labor intensive one that cannot be undone very easily, unlike an edit rollback or a page move. My point is not that things change, but rather the one similar to yours in that in the long run, there should optimally be some kind of automated updating of categories if an entire category is moved or redefined.
Viajero wrote:
Thanks to various responses to my earlier post on the subject, I have come to appreciate some of the possibilities of Categories, at least in theory. So, this morning I decided to categorize the forty or so articles on writers which I created or to which I made major contributions, having seen several suitable Categories show up in my Watchlist for similar articles. To start with, I wanted to know what categories have already been created for writers and journalists. For example, I saw a category for "Argentine writers" (the article on Borges I think). Does this mean I should also create a category for Uruguayan, Mexican, and Chilean writers (I wrote up a couple) even though such categories would have one or two entries? However, there doesn't seem to be any comprehensive, hierarchical index of categories. [[Wikipedia:Categorization]] should serve this function, but it is woefully incomplete.
Having had a little outing with categories, with mixed satisfaction, I'm more interested in going back and working on some topdown organization. As you probably saw from some of the people fooling with diagrams, there are a couple of different approaches. One is to create categories for every conjunction of categories - if you have "hotels" and "museums" and "Ohio", that means you should have "hotels in Ohio" and "museums in Ohio" categories. The other approach is to have an article in multiple categories and do nothing else, but then the software currently gives you no easy way to list only the museums in Ohio.
I personally tend to favor the second approach, because the conjunction categories will combinatorially explode and eventually outnumber articles. It seems more useful to add a way for the category page to optionally group members by a second category or some such, so you can go to the museums category and say "organize by location categories" and have it sort and group by country/state/city.
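As a rough illustration of the "organize by location categories" idea, here is a small Python sketch. The article names, the `LOCATIONS` set and the `group_by` function are assumptions made up for the example; the software has no such feature at the time of this discussion.

```python
from collections import defaultdict

# Hypothetical articles, each tagged with several broad categories.
ARTICLE_CATEGORIES = {
    "Museum A": {"Museums", "Ohio"},
    "Museum B": {"Museums", "Ohio"},
    "Museum C": {"Museums", "Idaho"},
    "Hotel D": {"Hotels", "Ohio"},
}
LOCATIONS = {"Ohio", "Idaho"}  # assumed set of "location" categories


def group_by(category, grouping):
    """Group the members of one category by a second family of categories."""
    groups = defaultdict(list)
    for title, cats in sorted(ARTICLE_CATEGORIES.items()):
        if category in cats:
            for key in sorted(cats & grouping):
                groups[key].append(title)
    return dict(groups)


print(group_by("Museums", LOCATIONS))
```

No "museums in Ohio" category is ever created here; the grouping is computed from the two broad tags each article already carries.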
In any case, I think category creation will settle down once there are about as many categories as there are lists now.
Stan
Stan Shebs wrote:
there are a couple of different approaches. One is to create categories for every conjunction of categories - if you have "hotels" and "museums" and "Ohio", that means you should have "hotels in Ohio" and "museums in Ohio" categories.
I favour this approach. Without it, [[Category:Museums]] would contain *all* museums in the world (millions!), and [[Category:Ohio]] would contain absolutely *everything* about Ohio (cities, museums, exhibitions, operas, theatres, cinemas, railways, bus lines, sightseeing, TV stations, and all sorts of other unrelated things).
The other approach is to have an article in multiple categories and do nothing else, but then the software currently gives you no easy way to list only the museums in Ohio.
And even if it did, my above comment still stands -- I think the only articles [[Category:Museum]] should directly contain are those that don't fit in any of its sub-categories (e.g. [[museum]]).
I personally tend to favor the second approach, because the conjunction categories will combinatorially explode and eventually outnumber articles.
How did you come to this conclusion? With the conjunction categories, the vast majority of articles will be in only one category. But even if your calculation was plausible, what's wrong with having loads of categories (which are just a bit of meta stuff) and a slightly smaller number of articles (all of which, however, will be loaded with useful information)?
It seems more useful to add a way for the category page to optionally group members by a second category or some such, so you can go to the museums category and say "organize by location categories" and have it sort and group by country/state/city.
Any scheme like that will always be limited in an annoying way. Imagine the particular location you want isn't in it. It would take developer effort to add it. Or imagine you want to combine two things, both of which aren't a location.
In any case, I think category creation will settle down once there are about as many categories as there are lists now.
Has there been an edit war about categories yet, at all? It seems that a lot of people are grumpy that categorisations of particular articles get changed multiple times, but as I said before, I don't see anything wrong with that, especially when it settles down to something that nobody insists on reverting.
Timwi
--- Timwi timwi@gmx.net wrote:
Stan Shebs wrote:
there are a couple of different approaches. One is to create categories for every conjunction of categories - if you have "hotels" and "museums" and "Ohio", that means you should have "hotels in Ohio" and "museums in Ohio" categories.
I favour this approach. Without it, [[Category:Museums]] would contain *all* museums in the world (millions!), and [[Category:Ohio]] would contain absolutely *everything* about Ohio (cities, museums, exhibitions, operas, theatres, cinemas, railways, bus lines, sightseeing, TV stations, and all sorts of other unrelated things).
That is simply bad database design. A user should be able to select [Museums] WITHIN [Ohio] or just [Museums] or even just [Ohio]. Otherwise different tags will be needed for all possible combinations of museums, sport stadiums, or amusement parks in every single area of the world. It would be much better to have tags for the places and tags for things. Let the user decide what combination of variables he or she wants to select. That will require an advanced search capability to be added to MediaWiki, but I assume that was planned anyway.
-- Daniel Mayer (aka mav)
I completely agree. I think the main thrust of categories right now is that perhaps they are becoming very quickly fine-grained (Fine grained categories are just more interesting perhaps).
Ideally, it would have been best to have only coarse-grained categories at first (using some kind of community process like Anthere described), creating finer-grained ones only later.
I realize that this is hindsight, but I think in the long run something like you propose would be very good and may eventually result in the fine-grained categories being replaced by more convenient coarse-grained ones. It would definitely make searching much more easy, if you could choose from "Ohio" and "museums" rather than having to search a much longer list for "museums of Ohio" or "Ohio museums" or "museums in Ohio" or "museums (Ohio)" etc.
On 06/07/04 at 02:09 PM, "Matthew Trump" wikipedia@decumanus.com said:
I think the main thrust of categories right now is that perhaps they are becoming very quickly fine-grained (Fine grained categories are just more interesting perhaps).
Exactly, and worse, not well documented.
Ideally, it would have been best to have only coarse-grained categories at first (using some kind of community process like Anthere described), creating finer-grained ones only later.
Yes, as has been proposed elsewhere, this should be the first task of the bespoke article validation committees.
I realize that this is hindsight, but I think in the long run something like you propose would be very good and may eventually result in the fine-grained categories being replaced by more convenient coarse-grained ones. It would definitely make searching much more easy, if you could choose from "Ohio" and "museums" rather than having to search a much longer list for "museums of Ohio" or "Ohio museums" or "museums in Ohio" or "museums (Ohio)" etc.
Indeed, at the risk of sounding simplistic, what is the point of fine-grained categories? If you do a full-text search on "ohio museums OR galleries" or "boston bruins player" doesn't that serve the purpose? The number of hits is manageable; how many museums are there in Ohio or hockey players from Boston worth an article anyway?
V.
Mentions Wikipedia protection of "nigritude ultramarine". I don't think this is really a problem for right now (especially) in the Sandbox context, but as always it's good to keep abreast of trends
http://slashdot.org/articles/04/06/07/1623244.shtml?tid=111&tid=126&...
On Monday 07 June 2004 20:13, Daniel Mayer wrote:
--- Timwi timwi@gmx.net wrote:
Stan Shebs wrote:
there are a couple of different approaches. One is to create categories for every conjunction of categories - if you have "hotels" and "museums" and "Ohio", that means you should have "hotels in Ohio" and "museums in Ohio" categories.
I favour this approach. Without it, [[Category:Museums]] would contain *all* museums in the world (millions!), and [[Category:Ohio]] would contain absolutely *everything* about Ohio (cities, museums, exhibitions, operas, theatres, cinemas, railways, bus lines, sightseeing, TV stations, and all sorts of other unrelated things).
That is simply bad database design. A user should be able to select [Museums] WITHIN [Ohio] or just [Museums] or even just [Ohio]. Otherwise different tags will be needed for all possible combinations of museums, sport stadiums, or amusement parks in every single area of the world. It would be much better to have tags for the places and tags for things. Let the user decide what combination of variables he or she wants to select. That will require an advanced search capability to be added to MediaWiki, but I assume that was planned anyway.
Haven't I just written that I have a patch which does exactly that? Category:Museums/Ohio returns all pages in Category:Museums and Category:Ohio.
At 02:52 PM 6/6/2004 +0300, Viajero wrote:
Obviously this huge page can't be used for looking up existing categories, if only because it would place a tremendous burden on the servers.
Including categories in the search engine would be nice, assuming the search engine was enabled. :) For now, I've been mostly just going to articles which IMO are likely to have similar categories to the ones I'm looking for and hoping that they've been categorized already. Or going to one of the "root" categories and trying to follow the hierarchy down to the specific ones I'm interested in.
- Boston Bruins players
What can possibly be the use of such narrow categories with only a handful of entries? Shouldn't we be aiming for broad categories (ie, albums, tv stations, awards, airports, museums, cuisines, athletes)?
Broad categories exist as well. These small, highly-specific categories fit under them as subcategories, which allows them to be grouped in very flexible ways. For example, the "Boston Bruins players" category could fall under the "Boston Bruins" category, which could fall under the "American football teams" category, which could fall under the "American football" category, so that if one wanted to grab a list of "all American football-related articles" one could recursively include the subcategories of "American football" and all those players' articles would be there. But "Boston Bruins" could also fall under the category "Boston", so those players would show up if one grabbed a list of "all Boston-related articles". They can also fall under the category "Team sports players", which could fall under the category "Sports players", which could fall under the category "Athletes". So a list of all athletes on Wikipedia would include the Boston Bruins players as well. (note: Wikipedia seems to be down right now, so I'm just making these details up hypothetically)
The alternative would be to give each of the Boston Bruins players the categories "American football", "Boston", and "Athletes", which would get extremely messy and require much fancier queries if, say, you wanted a list of articles about American football teams without including articles about the particular players.
On 06/06/04 at 12:58 PM, Bryan Derksen bryan.derksen@shaw.ca said:
For example, the "Boston Bruins players" category could fall under the "Boston Bruins" category, which could fall under the "American football teams" category, which could fall under the "American football" category, so that if one wanted to grab a list of "all American football-related articles" one could recursively include the subcategories of "American football" and all those players' articles would be there.
You meant hockey of course. The Bruins are a hockey team. ;-)
V. (Bostonian by birth)
At 09:10 PM 6/6/2004 +0300, Viajero wrote:
On 06/06/04 at 12:58 PM, Bryan Derksen bryan.derksen@shaw.ca said:
For example, the "Boston Bruins players" category could fall under the "Boston Bruins" category, which could fall under the "American football teams" category, which could fall under the "American football" category, so that if one wanted to grab a list of "all American football-related articles" one could recursively include the subcategories of "American football" and all those players' articles would be there.
You meant hockey of course. The Bruins are a hockey team. ;-)
When Wikipedia is down it's like a hemisphere of my brain is missing. :)
On Sunday 06 June 2004 13:52, Viajero wrote:
Thanks to various responses to my earlier post on the subject, I have come to appreciate some of the possibilities of Categories, at least in theory. So, this morning I decided to categorize the forty or so articles on writers which I created or to which I made major contributions, having seen several suitable Categories show up in my Watchlist for similar articles. To start with, I wanted to know what categories have already been created for writers and journalists. For example, I saw a category for "Argentine writers" (the article on Borges I think). Does this mean I should also create a category for Uruguayan, Mexican, and Chilean writers (I wrote up a couple) even though such categories would have one or two entries? However, there doesn't seem to be any comprehensive, hierarchical index of categories. [[Wikipedia:Categorization]] should serve this function, but it is woefully incomplete. [[Category:Main page]] is even worse. So, I turned to
Initially, I imagined that categories would be organized differently and that most articles would belong to several categories instead of a single subcategory. I even wrote a patch which enables displaying of articles that belong to multiple categories, for example, Category:Mexican/Writer would return all articles which are in both Category:Mexican and Category:Writer (this could be stretched even further, and you could have, for example, Category:Female/American/Singer/Blonde). But I guess that developers were too busy installing the new version of the software to look at it.
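The patch itself is not shown in this thread, so the following Python sketch is not its implementation. It only illustrates the described behaviour, a slash-separated category path resolving to the pages that belong to every named category, using a hypothetical `CATEGORY_MEMBERS` table and a hypothetical `pages_for` helper.

```python
# Hypothetical membership table: category name -> set of page titles.
CATEGORY_MEMBERS = {
    "Mexican": {"Writer X", "Painter Y"},
    "Writer": {"Writer X", "Writer Z"},
}


def pages_for(path):
    """Resolve 'Category:A/B/...' to the pages that are in every named category."""
    prefix = "Category:"
    if not path.startswith(prefix):
        raise ValueError("expected a path starting with 'Category:'")
    names = [name for name in path[len(prefix):].split("/") if name]
    if not names:
        return set()
    member_sets = [CATEGORY_MEMBERS.get(name, set()) for name in names]
    return set.intersection(*member_sets)


print(sorted(pages_for("Category:Mexican/Writer")))
```

One simplification to keep in mind: splitting on "/" assumes that category names never contain a slash themselves, which a real implementation would have to handle.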
Viajero wrote:
They aren't even consistently labelled! It is entirely possible, perhaps probable that someone will come along and create "Category:Contraltos opera singers" "Category:Tenors".
I have thought of this problem before, and I have come up with a solution, so here it is. Discussion is encouraged.
I think it should be possible to create a redirect between categories:
CATEGORY:TENORS
---------------
#REDIRECT [[Category:Tenor opera singers]]
(nothing new so far) and then, either one of the following things should happen:
1. The category [[Category:Tenor opera singers]] should display all articles that are categorised in categories that redirect to it. This is probably too database-intensive because it potentially involves several database read queries;
2. Saving an article with [[Category:Tenors]] in it should automatically replace it with [[Category:Tenor opera singers]] before storing in the DB. This would only involve one extra read query per saved article that has a category tag.
The latter option has the added bonus that we can then allow categories to be moved using the "Move" function, which automatically generates a redirect. The new category name won't immediately display all the articles, but the next time those articles are edited, the category tag will automatically be updated.
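Option 2 could be sketched in Python along these lines. This is a hedged illustration, not MediaWiki code: the `CATEGORY_REDIRECTS` table stands in for the one extra database read, and `normalize_categories` is a hypothetical function that would run on the article text just before it is stored.

```python
import re

# Hypothetical redirect table (stands in for the extra read query).
CATEGORY_REDIRECTS = {"Tenors": "Tenor opera singers"}

# Matches [[Category:Name]] and [[Category:Name|sort key]].
CATEGORY_TAG = re.compile(r"\[\[Category:([^\]|]+)((?:\|[^\]]*)?)\]\]")


def normalize_categories(wikitext):
    """Rewrite category tags that point at a redirected category."""

    def replace(match):
        name = match.group(1).strip()
        target = CATEGORY_REDIRECTS.get(name, name)
        return "[[Category:" + target + match.group(2) + "]]"

    return CATEGORY_TAG.sub(replace, wikitext)


print(normalize_categories("Some article text. [[Category:Tenors]]"))
```

The sketch follows a redirect only once, so a chain of category redirects would need either a loop or a rule that redirects may not point at other redirects.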
Happy implementing! ;-) Timwi
Viajero wrote:
Maybe I have been out of the loop but just what is the purpose of Categories?
Just now I noticed that someone went through a bunch of pages I was watching and added the category "Opera Composer" which seems entirely logical for the relevant articles, except that we already have a page [[List of opera composers]]. Does this mean that this article is redundant? My personal instinct, which may or may not be relevant here, is not to do things in two different ways when one suffices. Will people continue to update the handmade lists now that Categories pages are available in such cases of redundancy as this? If not, they should be deleted, as outdated lists are a help to nobody.
Yes, completely-filled-in and alphabetized but otherwise bare lists don't offer much over a category, but the most useful lists add annotations that help the reader find a desired entry without having to click on every entry just to see what's there. I think it will be a while before something as elaborate (and useful) as [[list of battles]] can be auto-constructed from category tags!
Stan