On 6/5/06, Steve Bennett stevagewp@gmail.com wrote:
On 6/5/06, Anthony DiPierro wikilegal@inbox.org wrote:
You could always put "See also: [[:Category:The Beatles]]" (I think that's the syntax) in the description for [[Category:British rock bands]].
That's probably not bad. The page would have a basic structure like:
Description
Related categories <-- new
Subcategories
Articles in this category
People *want* to put "related categories", but they break the category system if they make them subcats. We need to channel that desire.
Steve
While playing around with perl and [[Category:Airports]] last night I noticed an instance of what will probably be another common one: [[Category:Lists of airports]] is a subcategory. There are probably enough instances of that sort of thing that we would have to make an exception (in the interest of consensus), but the cleaner solution would be to go with something like the related categories idea above.
I should also note that [[Category:Airports]] itself is treated like a theme for articles, but the subcategories are treated like attributes. (Actually, now that I look at it directly on Wikipedia I see "Airport lounges" and "Airport operators" are also subcategories. I didn't notice that in my original, very fast, skim of the tree, but looking back it *was* there).
If anyone wants to take a look at my "tree" for airports, or my perl script (which is really simplistic), let me know where I can upload it. To run it you need to download and import two mysql database files, enwiki-20060518-categorylinks.sql (290 megs) and enwiki-20060518-page.sql (354 megs). The "tree" looks like this:
Airports_by_country
*Airports_in_Afghanistan
*#Bagram_Air_Base
*#Kabul_International_Airport
*Airports_in_Albania
*#Rinas_Mother_Teresa_Airport
[...]
*Airports_in_Australia
**Royal_Australian_Air_Force_bases
***Former_RAAF_Bases
***#RAAF_Station_Archerfield
***#RAAF_Station_Bairnsdale
***#RAAF_Base_Rathmines
***#RAAF_Station_Tocumwal
**#RAAF_Base_Amberley
**#RAAF_Bare_Bases
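For anyone who wants to reproduce this kind of listing, here is a minimal sketch in Python (not the Perl script mentioned above, which isn't shown in this thread). The names build_index and render_tree, the (member, parent, is_category) tuple layout, and the in-memory input are assumptions for illustration; they are not the real schema of the categorylinks dump. Subcategories get one "*" per level and articles get the parent's prefix plus "#", which is how the listing above appears to be laid out.

```python
from collections import defaultdict


def build_index(links):
    """Split (member, parent_category, is_category) rows into two lookups.

    The row layout is a hypothetical simplification of the real dump.
    """
    subcats = defaultdict(list)
    articles = defaultdict(list)
    for member, parent, is_category in links:
        target = subcats if is_category else articles
        target[parent].append(member)
    return subcats, articles


def render_tree(root, subcats, articles, max_depth=10):
    """Return the tree as a list of lines, descending at most max_depth levels."""
    lines = []

    def walk(category, depth):
        lines.append("*" * depth + category)
        # Stop descending at the depth limit; this is a crude safety net,
        # not loop detection.
        if depth < max_depth:
            for sub in sorted(subcats.get(category, ())):
                walk(sub, depth + 1)
        for article in sorted(articles.get(category, ())):
            lines.append("*" * depth + "#" + article)

    walk(root, 0)
    return lines
```

Calling render_tree("Airports_by_country", subcats, articles) on rows taken from the listing above should give the same nesting, with subcategories listed before the articles of each category.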
Are air force bases airports? I'd say so.
I also tried making a tree for [[Category:Buildings and structures]] (a "parent" of airports). That one grew quite messy, and my script bailed at 10 levels of recursion, so I haven't really analysed it much. I'll try turning it up to 20 or 25 and see what happens.
Anthony
On 6/5/06, Anthony DiPierro wikilegal@inbox.org wrote:
I also tried making a tree for [[Category:Buildings and structures]] (a "parent" of airports). That one grew quite messy, and my script bailed at 10 levels of recursion, so I haven't really analysed it much. I'll try turning it up to 20 or 25 and see what happens.
Well, that problem became obvious really quick.
Coastal_construction
*Ports_and_harbours
**Port_cities
***Edinburgh
****Education_in_Edinburgh
*****University_of_Edinburgh
******University_of_Edinburgh_alumni
*******Arthur_Conan_Doyle
********Sherlock_Holmes
*********Arthur_Conan_Doyle
**********Sherlock_Holmes
Guess I have to add in some sort of loop detection. Of course, "Port cities" shouldn't be in "Ports and harbours", "Edinburgh" shouldn't be in "Port cities", and "Arthur Conan Doyle" shouldn't be in "University of Edinburgh alumni" :).
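One way to do the loop detection is to remember which categories are on the current path from the root and refuse to descend into one that is already there. This is only a sketch in Python under that assumption; render_with_loop_detection, the "(loop)" marker, and the dict-of-lists input are hypothetical, not part of the actual script.

```python
def render_with_loop_detection(root, subcats):
    """Walk the subcategory graph depth-first, flagging categories that
    reappear on the current path instead of recursing into them again.

    subcats maps a category name to a list of its subcategory names.
    """
    lines = []

    def walk(category, depth, on_path):
        if category in on_path:
            # Already an ancestor of itself: report the loop and stop here.
            lines.append("*" * depth + category + " (loop)")
            return
        lines.append("*" * depth + category)
        on_path.add(category)
        for sub in subcats.get(category, ()):
            walk(sub, depth + 1, on_path)
        # Leaving this branch, so the category may legitimately appear
        # again elsewhere in the tree.
        on_path.discard(category)

    walk(root, 0, set())
    return lines
```

Tracking only the current path, rather than every category seen so far, means a category that is reachable by two different routes is still printed under both parents; only genuine cycles like Arthur_Conan_Doyle / Sherlock_Holmes are cut off.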
Anthony
On 6/5/06, Anthony DiPierro wikilegal@inbox.org wrote:
Guess I have to add in some sort of loop detection. Of course, "Port cities" shouldn't be in "Ports and harbours", "Edinburgh" shouldn't be in "Port cities", and "Arthur Conan Doyle" shouldn't be in "University of Edinburgh alumni" :).
Meanwhile, "Arthur Conan Doyle" shouldn't be a category, and if it is, it's a thematic one and should not be a subcategory of Alumni of any university.
I can understand the Doyle/Sherlock Holmes confusion, though. Logic dictates that Doyle (or "Works of Arthur Conan Doyle") is the supercategory.
Steve
Steve Bennett wrote:
On 6/5/06, Anthony DiPierro wikilegal@inbox.org wrote:
Guess I have to add in some sort of loop detection. Of course, "Port cities" shouldn't be in "Ports and harbours", "Edinburgh" shouldn't be in "Port cities", and "Arthur Conan Doyle" shouldn't be in "University of Edinburgh alumni" :).
Meanwhile, "Arthur Conan Doyle" shouldn't be a category, and if it is, it's a thematic one and should not be a subcategory of Alumni of any university.
I can understand the Doyle/Sherlock Holmes confusion, though. Logic dictates that Doyle (or "Works of Arthur Conan Doyle") is the supercategory.
Way back in the mists of history when categories were first implemented I created a couple of templates intended to be put onto the category pages to identify whether the category contained articles that were examples of the category's subject or articles that were just _about_ the category's subject. There seemed to be no interest in using them and I didn't think it important enough to raise a fuss about, so I figured I'd just sit back and watch how categorization actually got used rather than trying to impose my vision on it.
Perhaps it'd be useful to recreate similar templates now, though, if enough people think it's a problem? That way there'd be no major disruption to the category tree, but people who wanted to do fancy culling of subsets of articles could add just a little parsing intelligence to whatever program they're using to determine what types of categories they're dealing with.
On 6/5/06, Bryan Derksen bryan.derksen@shaw.ca wrote:
Way back in the mists of history when categories were first implemented I created a couple of templates intended to be put onto the category pages to identify whether the category contained articles that were examples of the category's subject or articles that were just _about_ the category's subject. There seemed to be no interest in using them and I didn't think it important enough to raise a fuss about, so I figured I'd just sit back and watch how categorization actually got used rather than trying to impose my vision on it.
Perhaps it'd be useful to recreate similar templates now, though, if enough people think it's a problem? That way there'd be no major disruption to the category tree, but people who wanted to do fancy culling of subsets of articles could add just a little parsing intelligence to whatever program they're using to determine what types of categories they're dealing with.
This does sound like a good start, but will need some intelligence in its application. One of the problems is that good categorisation is somewhat challenging, and the "average person" won't necessarily make good use of these templates. In fact the same could probably be said for all semantic markup, of which this is an example.
Do you have any examples? What would they look like? Perhaps:
{{thematic category|name of subject}} --> This category should be applied to all articles which have a strong link with <name of subject> and
{{taxonomic category|type of thing|thematic category}} --> This category should be applied to articles which are an example of a <type of thing>, and which do not belong to a subcategory. Articles which are just related to this topic should go in [[:Category:<Thematic category>]] instead.
Anyone want to mock one up?
Steve
Steve Bennett wrote:
Do you have any examples? What would they look like? Perhaps:
{{thematic category|name of subject}}
I don't think we should make things unnecessarily complicated by making the templates take parameters, just use {{PAGENAME}}. If the category's name isn't suitable for use in the template describing it then I suspect it's an indication that the category's name might need changing. :)
How about "Members of this category should be specific examples of {{PAGENAME}}" and "Members of this category or its subcategories should be about the topic of {{PAGENAME}}"? I'm deliberately leaving the "or its subcategories" off of the is-a template, so that we can continue having things like [[Category:Seattle]] be a subcategory of [[Category:Cities in Washington]] (or whatever the real category names are) like we do now.
On 6/6/06, Bryan Derksen bryan.derksen@shaw.ca wrote:
Steve Bennett wrote:
Do you have any examples? What would they look like? Perhaps:
{{thematic category|name of subject}}
I don't think we should make things unnecessarily complicated by making the templates take parameters, just use {{PAGENAME}}. If the category's name isn't suitable for use in the template describing it then I suspect it's an indication that the category's name might need changing. :)
Mmm...you may be right. In any case, the text on the category page should provide enough context to clarify. There may be cases where a category is forced to obey a particular naming scheme which isn't the most descriptive for what it is, but I can't think of any immediately.
How about "Members of this category should be specific examples of {{PAGENAME}}"
I'll try a couple of random examples:
Members of this category should be specific examples of Districts of Berlin.
Members of this category should be specific examples of Billboard Hot 100 number-one singles
Members of this category should be specific examples of Russian and Soviet polar explorers [an example where the text would be better as "Russian *or* Soviet polar explorers"?]
Members of this category should be specific examples of Cities in Kentucky
Members of this category should be specific examples of Pakistan movement activists
Members of this category should be specific examples of Lists of state leaders by year [this article also had the presumably "thematic" category 730s - a counterexample to the plurals rule]
Members of this category should be specific examples of United Kingdom court systems
Members of this category should be specific examples of Members and associates of the US National Academy of Sciences [another and -> or]
Members of this category should be specific examples of Rivers of Indonesia ["Rivers in Indonesia" would be more natural]
Members of this category should be specific examples of Pejorative names for people.
I guess in general it works, but some other ideas:
Articles in this category should describe individual ___
Articles in this category should be ___
Only ___ should be in this category.
Only articles about [specific | individual] ___ should be in this category.
Also note there is a capitalisation issue to deal with... There is also a problem with "1923 births" type cats - these don't fit comfortably into any of these sentences.
(incidentally, with this distinction between taxonomic and thematic, is it time to up the ante and say that every article should have at least one taxonomic category? by random article I come to "Chrysler Phaeton" which only has the thematic category "Chrysler" - whereas it really should have at least the taxonomic "Cars" or something)
and "Members of this category or its subcategories should be about the topic of {{PAGENAME}}"?
Members of this category or its subcategories should be about the topic of... Education in New York
...Huntingdon County, Pennsylvania
...Oslo T-bane
...Baseball
...Isle of Man
I don't think it works - [[The Buchan School]] is not "about the topic of The Isle of Man". What is the relationship exactly? "should have a strong connection with"?
I'm deliberately leaving the "or its subcategories" off of the is-a template, so that we can continue having things like [[Category:Seattle]] be a subcategory of [[Category:Cities in Washington]] (or whatever the real category names are) like we do now.
I think we just decided in the long thread that that's a bad idea as follows:
Cities in Washington is a taxonomic category
Seattle is a thematic category
Thematic categories should never be subcategories of taxonomic categories (although the reverse is ok)
If your proposal is to allow for a graceful changeover, then I understand.
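If categories were labelled, the rule that thematic categories should never sit under taxonomic ones could be checked mechanically. A rough sketch in Python, assuming a hypothetical kind mapping from category name to "taxonomic" or "thematic" (no such labelling exists yet; that is exactly what the templates would provide):

```python
def find_thematic_under_taxonomic(subcats, kind):
    """Return (parent, child) pairs where a thematic category is a direct
    subcategory of a taxonomic one.

    subcats maps a category to its subcategories; kind maps a category to
    "taxonomic" or "thematic". Both structures are assumptions for this
    sketch. Unlabelled categories are never reported.
    """
    return sorted(
        (parent, child)
        for parent, children in subcats.items()
        for child in children
        if kind.get(parent) == "taxonomic" and kind.get(child) == "thematic"
    )
```

The reverse direction (a taxonomic category under a thematic one) is deliberately not flagged, since that case is fine.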
Steve
Steve Bennett wrote:
On 6/6/06, Bryan Derksen bryan.derksen@shaw.ca wrote:
I'm deliberately leaving the "or its subcategories" off of the is-a template, so that we can continue having things like [[Category:Seattle]] be a subcategory of [[Category:Cities in Washington]] (or whatever the real category names are) like we do now.
I think we just decided in the long thread that that's a bad idea as follows:
Cities in Washington is a taxonomic category
Seattle is a thematic category
Thematic categories should never be subcategories of taxonomic categories (although the reverse is ok)
If your proposal is to allow for a graceful changeover, then I understand.
I don't think a "changeover" is needed, and I don't see why thematic categories being subcategories of taxonomic categories is a bad thing provided the categories are all properly labelled. The way things are now is pretty intuitive for human browsing, which makes sense since humans are the ones that organized it the way it is. But if you want to spider a collection of articles on a taxonomic basis (for example building a list of all cities in the United States with articles) then all you need to do is write the spider to ignore subcategories that aren't labelled as taxonomic. It'll ignore [[Category:Seattle]] and all the contents thereof.
Seems to me that forbidding thematic subcategories of taxonomic categories will result in a huge number of orphaned thematic categories.
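A spider along those lines could be sketched as below. This is Python under stated assumptions: the kind mapping stands in for whatever label the templates would put on each category page, and collect_taxonomic_articles and the dict inputs are hypothetical names, not an existing tool.

```python
def collect_taxonomic_articles(root, subcats, articles, kind):
    """Gather articles reachable from root through taxonomic categories only.

    subcats and articles map a category to its subcategories and member
    articles; kind maps a category to "taxonomic" or "thematic". Any
    category not labelled "taxonomic" is skipped along with everything
    beneath it.
    """
    found = set()
    seen = set()
    stack = [root]
    while stack:
        category = stack.pop()
        if category in seen or kind.get(category) != "taxonomic":
            continue
        seen.add(category)  # also protects against category loops
        found.update(articles.get(category, ()))
        stack.extend(subcats.get(category, ()))
    return sorted(found)
```

With [[Category:Seattle]] labelled thematic, the walk never enters it, so articles that are merely about Seattle stay out of the list of cities while the Seattle article itself is still picked up from the taxonomic parent.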
On 6/6/06, Bryan Derksen bryan.derksen@shaw.ca wrote:
I don't think a "changeover" is needed, and I don't see why thematic categories being subcategories of taxonomic categories is a bad thing provided the categories are all properly labelled. The way things are now is pretty intuitive for human browsing, which makes sense since humans are the ones that organized it the way it is. But if you want to spider a collection of articles on a taxonomic basis (for example building a list of all cities in the United States with articles) then all you need to do is write the spider to ignore subcategories that aren't labelled as taxonomic. It'll ignore [[Category:Seattle]] and all the contents thereof.
That seems reasonable.
Steve