Hi all,
The English Wikipedia categorises biographies by gender in some circumstances (eg athletes), but not systematically in the way that German does - there are no supercategories of "Men", "Women", etc, designed to list all members of those groups, and plenty of biography articles have no "gendered" categories. There are, of course, good reasons to avoid this, and conversely good reasons to do it... but I'm wondering why we do it this way.
I remember it being referred to many years ago as long-standing practice, but I've dug around a bit in the discussion archives and can't seem to pin it down. It's probably pre-2004, maybe even pre-2003 - anyone remember?
On 18 July 2012 10:47, Andrew Gray andrew.gray@dunelm.org.uk wrote:
Hi all,
The English Wikipedia categorises biographies by gender in some circumstances (eg athletes), but not systematically in the way that German does - there are no supercategories of "Men", "Women", etc, designed to list all members of those groups, and plenty of biography articles have no "gendered" categories. There are, of course, good reasons to avoid this, and conversely good reasons to do it... but I'm wondering why we do it this way.
Hmm, you really want to go there? What is ultimately verifiable has to take account of [[Category:Sex chromosome aneuploidies]]. Not to speak of other intrusions.
Charles
On 18 July 2012 10:47, Andrew Gray andrew.gray@dunelm.org.uk wrote:
I remember it being referred to many years ago as long-standing practice, but I've dug around a bit in the discussion archives and can't seem to pin it down. It's probably pre-2004, maybe even pre-2003 - anyone remember?
As with almost all our category system, it's basically ad hoc. I suggest if you can propose something not insane to relevant wikiprojects and are prepared to do the bot work yourself, you can have endless fun clicking "save" in AWB for a few hours.
- d.
On Wed, Jul 18, 2012 at 11:04 AM, David Gerard dgerard@gmail.com wrote:
On 18 July 2012 10:47, Andrew Gray andrew.gray@dunelm.org.uk wrote:
I remember it being referred to many years ago as long-standing practice, but I've dug around a bit in the discussion archives and can't seem to pin it down. It's probably pre-2004, maybe even pre-2003 - anyone remember?
As with almost all our category system, it's basically ad hoc. I suggest if you can propose something not insane to relevant wikiprojects and are prepared to do the bot work yourself, you can have endless fun clicking "save" in AWB for a few hours.
German Wikipedia has categories Man, Woman, Intersexual, and "Gender unknown". Works perfectly well for all biographies.
Magnus
On 7/18/12, David Gerard dgerard@gmail.com wrote:
On 18 July 2012 10:47, Andrew Gray andrew.gray@dunelm.org.uk wrote:
I remember it being referred to many years ago as long-standing practice, but I've dug around a bit in the discussion archives and can't seem to pin it down. It's probably pre-2004, maybe even pre-2003 - anyone remember?
As with almost all our category system, it's basically ad hoc. I suggest if you can propose something not insane to relevant wikiprojects and are prepared to do the bot work yourself, you can have endless fun clicking "save" in AWB for a few hours.
For 1,000,000 articles? I think it should be done, but it will take more than a few hours. I think it could be done very quickly, if lots of people got involved. And I don't think the cases where it is unclear or a matter of privacy (a vanishingly small number) should preclude the obvious cases being done. It doesn't seem quite right that the potential for arguments over edge cases and how to handle them sensitively, would preclude being able to search by gender.
For instance, the ODNB online allows you to search by gender: with the options male, female and family/group. The latter is only 420 articles. For female you get 6,265 articles, and for male you get 51,940. It would be nice to do the same for Wikipedia's biographies, distinguishing between groups and single articles, and between men and women (and other genders).
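As a rough illustration of what those ODNB figures imply, the shares can be worked out in a few lines. This is only a sketch: the counts are the ones quoted above, and the names `odnb_counts` and `shares` are made up for the example.

```python
# Counts quoted above from the ODNB online search (male, female, family/group).
odnb_counts = {"male": 51940, "female": 6265, "family/group": 420}


def shares(counts):
    """Return each group's share of the total, as a percentage to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


odnb_shares = shares(odnb_counts)
for name, pct in odnb_shares.items():
    print(f"{name}: {pct}%")
```

On those numbers, female entries come to roughly 10.7% of the 58,625 articles, male entries to about 88.6%, and family/group entries to about 0.7%. The same calculation would apply to Wikipedia's biographies if comparable counts were available.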
Examples of discussions include this:
http://en.wikipedia.org/wiki/Wikipedia:Categories_for_deletion/Log/2005_June...
And this:
http://en.wikipedia.org/wiki/Wikipedia_talk:Categorization/Ethnicity,_gender...
But that is around 2005. Not looked earlier than that yet.
Carcharoth
On 18 July 2012 11:56, Carcharoth carcharothwp@googlemail.com wrote:
And I don't think the cases where it is unclear or a matter of privacy (a vanishingly small number) should preclude the obvious cases being done. It doesn't seem quite right that the potential for arguments over edge cases and how to handle them sensitively, would preclude being able to search by gender.
Well, OK. As far as I can see, the standard infobox for people does not include a field M/F. Within [[Category:Actors]], where the WikiProject specifies use of an infobox, it seems that the gendered occupation "actress" is still widely used. I had assumed this was obsolescent, as "poetess" and "authoress" are now obsolete. It would be an improvement to use "actor" throughout and make M/F either a field in the infobox, or a category, or both.
I just think we should contemplate what is going on here. A couple of points are:
* information in infoboxes and categories is subject to WP:V like everything else, though stuff is often enough sneaked in (not good); and
* the "German" style of categorisation, per previous discussion here, has both advantages and disadvantages.
In the light of Wikidata, an infobox route whereby V for infobox entries is applied "centrally" now looks good. Native speakers tend to be able to read gender from forenames (not in all cases, though). We should have a well-considered case for flagging it to non-native speakers.
Charles
On 18 July 2012 12:13, Charles Matthews charles.r.matthews@ntlworld.com wrote:
Well, OK. As far as I can see, the standard infobox for people does not include a field M/F. Within [[Category:Actors]], where the WikiProject specifies use of an infobox, it seems that the gendered occupation "actress" is still widely used. I had assumed this was obsolescent, as "poetess" and "authoress" are now obsolete. It would be an improvement to use "actor" throughout and make M/F either a field in the infobox, or a category, or both.
Make gender a text field, that should satisfy all plausible edge cases while covering the common two cases.
- d.
"Actress" is certainly not obsolescent in common usage, and I would suggest it is not the role of Wikipedia to redefine the English language.
At least until the Academy changes the name of its award for performance by an actress in a leading role...
On 18 Jul 2012, at 12:13, Charles Matthews charles.r.matthews@ntlworld.com wrote:
Well, OK. As far as I can see, the standard infobox for people does not include a field M/F. Within [[Category:Actors]], where the WikiProject specifies use of an infobox, it seems that the gendered occupation "actress" is still widely used. I had assumed this was obsolescent, as "poetess" and "authoress" are now obsolete. It would be an improvement to use "actor" throughout and make M/F either a field in the infobox, or a category, or both.
On 18 July 2012 12:32, james.farrar@gmail.com wrote:
"Actress" is certainly not obsolescent in common usage, and I would suggest it is not the role of Wikipedia to redefine the English language.
The point here is whether "occupation" is gendered, though, in this case. Cf. firefighter, seafarer and so on.
Charles
On 18 July 2012 13:03, Charles Matthews charles.r.matthews@ntlworld.com wrote:
On 18 July 2012 12:32, james.farrar@gmail.com wrote:
"Actress" is certainly not obsolescent in common usage, and I would suggest it is not the role of Wikipedia to redefine the English language.
The point here is whether "occupation" is gendered, though, in this case. Cf. firefighter, seafarer and so on.
Interesting that you pick two occupations where, prior to more modern times, female participation has been fairly non-existent.
Tom
On Jul 18, 2012 8:56 PM, "Carcharoth" carcharothwp@googlemail.com wrote:
On 7/18/12, David Gerard dgerard@gmail.com wrote:
On 18 July 2012 10:47, Andrew Gray andrew.gray@dunelm.org.uk wrote:
I remember it being referred to many years ago as long-standing practice, but I've dug around a bit in the discussion archives and can't seem to pin it down. It's probably pre-2004, maybe even pre-2003 - anyone remember?
As with almost all our category system, it's basically ad hoc. I suggest if you can propose something not insane to relevant wikiprojects and are prepared to do the bot work yourself, you can have endless fun clicking "save" in AWB for a few hours.
For 1,000,000 articles? I think it should be done, but it will take more than a few hours. I think it could be done very quickly, if lots of people got involved.
Laura and I did it for Australian sports people. It is time consuming as category structures need to be created.
And I don't think the cases where it is unclear or a matter of privacy (a vanishingly small number) should preclude the obvious cases being done. It doesn't seem quite right that the potential for arguments over edge cases and how to handle them sensitively, would preclude being able to search by gender.
When used in category intersections, it's really useful info for gender studies.
-- JV
On 29 July 2012 22:53, John Vandenberg jayvdb@gmail.com wrote:
And I don't think the cases where it is unclear or a matter of privacy (a vanishingly small number) should preclude the obvious cases being done. It doesn't seem quite right that the potential for arguments over edge cases and how to handle them sensitively, would preclude being able to search by gender.
When used in category intersections, it's really useful info for gender studies.
Indeed. I would love to see a chart of gender-in-Wikipedia compared to birth years, for example, over the last few centuries. Even in the simplest cases (pure coverage numbers) it's an interesting tool for understanding our coverage and how that reflects systemic biases.
I'm interested to hear that you categorised for a large set - did it work smoothly, labour-intensiveness aside, and was there any pushback?
No pushback... yet... ;) Just a few justified complaints when I made mistakes.
I only categorised women...leaving un-gender-categorised articles as 'men' and then deduced the percentage of women from that.
I figured that the chances of pushback were higher if I 'touched' lots of articles and cats about men ;) Those edits would be seen as less useful, as in many cases there are very few women with articles in male-dominated sports. E.g. Rugby, where the female leagues don't attract usable levels of press coverage, making notability difficult, and limiting the number of people likely to care enough to write a bio.
I did have a complaint that I wasn't categorising men as well, and I think I promised to do that, but haven't done so.
On Jul 30, 2012 8:15 AM, "Andrew Gray" andrew.gray@dunelm.org.uk wrote:
On 29 July 2012 22:53, John Vandenberg jayvdb@gmail.com wrote:
And I don't think the cases where it is unclear or a matter of privacy (a vanishingly small number) should preclude the obvious cases being done. It doesn't seem quite right that the potential for arguments over edge cases and how to handle them sensitively, would preclude being able to search by gender.
When used in category intersections, it's really useful info for gender studies.
Indeed. I would love to see a chart of gender-in-Wikipedia compared to birth years, for example, over the last few centuries. Even in the simplest cases (pure coverage numbers) it's an interesting tool for understanding our coverage and how that reflects systemic biases.
I'm interested to hear that you categorised for a large set - did it work smoothly, labour-intensiveness aside, and was there any pushback?
--
- Andrew Gray andrew.gray@dunelm.org.uk
WikiEN-l mailing list WikiEN-l@lists.wikimedia.org To unsubscribe from this mailing list, visit: https://lists.wikimedia.org/mailman/listinfo/wikien-l
On 7/18/12 11:47 AM, Andrew Gray wrote:
The English Wikipedia categorises biographies by gender in some circumstances (eg athletes), but not systematically in the way that German does - there are no supercategories of "Men", "Women", etc, designed to list all members of those groups, and plenty of biography articles have no "gendered" categories. There are, of course, good reasons to avoid this, and conversely good reasons to do it... but I'm wondering why we do it this way.
I remember it being referred to many years ago as long-standing practice, but I've dug around a bit in the discussion archives and can't seem to pin it down. It's probably pre-2004, maybe even pre-2003 - anyone remember?
My vaguely informed guess as to why is that English-Wikipedia categories have developed mainly as a folksonomy intended for navigation, as opposed to a rational, top-down taxonomy intended for sorting things into bins, which is closer to how the German Wikipedia does it. Not universally true, but it's their general flavor.
Many of the "Women in X" categories, for example, are maintained by WikiProject Women's History. They can be useful for navigation in contexts related to the WikiProject or some of its goals. For example, students looking for a subject to write about during a Women's History Month assignment might find a category like [[Category:Women astronomers]] useful for navigation.
From that perspective, why there aren't equivalent "Men in X" categories is related to why there isn't a WikiProject Men's History, or a Men's History Month: basically, men have not been as systematically left out of many professions and histories, so there is less interest in or need to focus specifically on "Men astronomers" in order to emphasize their overlooked contributions. For similar reasons, we have categories such as [[Category:African-American inventors]], but not [[Category:White American inventors]].
I'm not sure if that's the best way to do it, but I think that asymmetry in interest and navigational usefulness is why we have some asymmetries in the category structure. As for changing it, I think it'll have to be looked at on an area-by-area basis with involvement of relevant wikiprojects, because some of the category systems are fairly complex and/or brittle, and people have opinions about them. In sports, for example, many people are already categorized into the leagues they play in, and many leagues are single-gender, so that could provide an easy way of adding people indirectly to a category without going through and editing tens of thousands of articles.
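To make the league idea a bit more concrete, here is a minimal sketch of how gender might be derived indirectly from existing league categories. Everything in it is an assumption for illustration: the category names are invented, the `LEAGUE_GENDER` mapping would have to be compiled by hand per sport, and `infer_gender` is a hypothetical helper, not an existing tool.

```python
from typing import Optional

# Hypothetical mapping from single-gender league categories to a gender label.
# In practice this table would need to be compiled and checked per sport.
LEAGUE_GENDER = {
    "Category:Example Women's League players": "female",
    "Category:Example Men's League players": "male",
}


def infer_gender(categories: list[str]) -> Optional[str]:
    """Return a gender only when all known league categories on an article agree.

    Articles with no known league category, or with conflicting ones,
    yield None and would be left for a human to look at.
    """
    found = {LEAGUE_GENDER[c] for c in categories if c in LEAGUE_GENDER}
    return found.pop() if len(found) == 1 else None
```

The deliberate choice here is to return nothing rather than guess when the categories conflict or are missing, so the edge cases stay out of any automated pass.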
Alternately (or perhaps, additionally), there are increasingly more ways than the category system for encoding metadata, if the goal is to use it for external sorting rather than navigation. For example, perhaps Template:Infobox_person could have a gender field, which would then be picked up by DBpedia and similar projects that extract infobox data.
-Mark
On Wednesday, 18 July 2012 at 13:18, Delirium wrote:
I'm not sure if that's the best way to do it, but I think that asymmetry in interest and navigational usefulness is why we have some asymmetries in the category structure. As for changing it, I think it'll have to be looked at on an area-by-area basis with involvement of relevant wikiprojects, because some of the category systems are fairly complex and/or brittle, and people have opinions about them. In sports, for example, many people are already categorized into the leagues they play in, and many leagues are single-gender, so that could provide an easy way of adding people indirectly to a category without going through and editing tens of thousands of articles.
Alternately (or perhaps, additionally), there are increasingly more ways than the category system for encoding metadata, if the goal is to use it for external sorting rather than navigation. For example, perhaps Template:Infobox_person could have a gender field, which would then be picked up by DBpedia and similar projects that extract infobox data.
Funny you should mention DBpedia. DBpedia can only work based on the things in Wikipedia and given that we don't include gender in Wikipedia infoboxes or category structures, there won't be anything in DBpedia.
But, DBpedia links into Freebase, and Freebase has been running a game through the 'Freebase apps' platform called "Genderizer". This allows people to select either from a queue of real or fictional people and set their gender based on the lead from their Wikipedia article. While this isn't a reliable source to integrate the information back into Wikipedia, for the purposes of doing a rough study into the gender ratios of Wikipedia articles about people (and fictional people), Freebase may do what you want.
On 7/18/12, Tom Morris tom@tommorris.org wrote:
Funny you should mention DBpedia. DBpedia can only work based on the things in Wikipedia and given that we don't include gender in Wikipedia infoboxes or category structures, there won't be anything in DBpedia.
But, DBpedia links into Freebase, and Freebase has been running a game through the 'Freebase apps' platform called "Genderizer". This allows people to select either from a queue of real or fictional people and set their gender based on the lead from their Wikipedia article. While this isn't a reliable source to integrate the information back into Wikipedia, for the purposes of doing a rough study into the gender ratios of Wikipedia articles about people (and fictional people), Freebase may do what you want.
Interesting. This could be documented somewhere in Wikipedia's documentation - is it? And do they cover famous animals as well?
http://en.wikipedia.org/wiki/Nils_Olav
Carcharoth