Here (http://notconfusing.com/preliminary-results-from-wigi-the-wikipedia-gender-inequality-index/) are some early findings from a research project I am involved in (together with Maximilian Klein). (To find out more about the project, see https://meta.wikimedia.org/wiki/Research:Wikipedia_Gender_Inequality_Index and its talk page.) We are very curious what you think (don't hesitate to be critical). What we would really appreciate are any alternative hypotheses (to the one presented) that could explain why the post-1950s Confucian and South Asian clusters seem so much more inclusive of female biographies than others (including the "Western" clusters). Are we seeing a data error, or something else, and if so, what?
Hoi, Having read it, I find it is still very much Wikipedia-oriented. It makes use of the toolset by Markus. That is fine. The notion of diversity and notability is also very much culturally defined. It would be nice to know how the different Wikipedias accept the notability of people from other cultures and whether it impacts the diversity of their own articles.
I have found that many people do not have an article in the languages of their own cultures. Often it has to do with an interest in a domain that is more relevant to the other culture.
Diversity is very much part of a domain; in Roman Catholicism male dominance is obvious. I am curious whether diversity in gender is affected by such considerations, and whether items with a single article are more in line with what is the norm for a culture or a domain. Thanks, GerardM
On 10 January 2015 at 11:51, Piotr Konieczny piokon@post.pl wrote:
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Hello Piotr and Gerard,
I think a competing hypothesis would be the "male gaze". That is to say, greater female representation is not about a culture (defined as national, ethnic, linguistic or regional, not macho/feminine), but rather a gender-interest bias. Thus greater female representation could indicate a more male-dominated culture, which runs against the theoretical assumption of Piotr's research.
Note that the East Asian Wikipedians I know, especially those who edit Chinese Wikipedia, are predominantly very young. Some of them can be highly interested in the opposite sex.
Check the following category pages as examples:
(1a) Actresses of every country in the world: http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9C%8B%E5%A5%B3%E6%BC%94%E...
(1b) Male actors of every country in the world: http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9B%BD%E7%94%B7%E6%BC%94%E...
(2a) Japanese AV (i.e. porn) actresses: http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E5%A...
(2b) Male Japanese AV (i.e. porn) actors: http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E7%9...
It seems quite clear that the male gaze hypothesis applies here: there is more female representation simply because these women are there to be consumed by men or boys.
So one of my suggestions for research is to select a few professional categories that are of interest (say, politicians, poets, entertainers, etc.) to do some cross-tab analysis.
Thus, I would be extremely cautious about using the current metrics/methods as a viable "gender inequality index".
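The cross-tab suggested above could be sketched roughly as follows. This is only an illustration: the record layout, the occupation labels and every count are made up, and the names `cross_tab` and `female_share` are hypothetical helpers, not part of the WIGI toolset. It also only counts two gender values, which is a simplification.

```python
from collections import Counter

# Hypothetical records: (language edition, occupation, gender).
# All values are invented for illustration; real data might come from Wikidata.
biographies = [
    ("zh", "entertainer", "female"),
    ("zh", "entertainer", "female"),
    ("zh", "entertainer", "male"),
    ("zh", "politician", "male"),
    ("zh", "politician", "male"),
    ("zh", "politician", "female"),
    ("zh", "poet", "male"),
]


def cross_tab(records):
    """Count biographies per (occupation, gender) cell."""
    return Counter((occupation, gender) for _, occupation, gender in records)


def female_share(table, occupation):
    """Share of female biographies within one occupation, or None if it is empty."""
    female = table[(occupation, "female")]
    male = table[(occupation, "male")]
    total = female + male
    return female / total if total else None


table = cross_tab(biographies)
for occupation in ("entertainer", "politician", "poet"):
    print(occupation, female_share(table, occupation))
```

If the female share among entertainers is much higher than among politicians or poets within the same language edition, that would be consistent with the male gaze reading rather than with a more inclusive culture overall.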
As a proponent of the "data normalization" and "geographic normalization" methods myself, I would distinguish two sets of comparisons: one compares absolute values across countries or language versions; the other compares "normalized" values across countries or language versions. By geographic normalization, I mean that researchers must gather another set of cross-country or cross-language datasets that captures some aspects of realities "external" to Wikipedia. In this case, I would compare the gender ratio of politicians represented on Wikipedia against the offline gender ratio of politicians. In other words, "data normalization" allows researchers to compare which language versions are more or less equal (and by how much) than the corresponding offline societies.
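A minimal sketch of the normalized comparison, assuming one has female/male counts of politicians both on a Wikipedia edition and in some offline source. The function names, the edition labels and all numbers below are hypothetical; dividing one share by the other is just one possible way to normalize.

```python
def female_share(female, male):
    """Fraction of women among the counted people, or None if nobody is counted."""
    total = female + male
    return female / total if total else None


def normalized_ratio(wiki_female, wiki_male, offline_female, offline_male):
    """Wikipedia female share divided by the offline female share.

    A value of 1.0 means the edition mirrors the offline ratio; below 1.0 it
    under-represents women relative to offline society, above 1.0 it
    over-represents them. Returns None when the ratio is undefined.
    """
    wiki = female_share(wiki_female, wiki_male)
    offline = female_share(offline_female, offline_male)
    if wiki is None or not offline:
        return None
    return wiki / offline


# Made-up counts: (female, male) politicians on the wiki and offline.
editions = {
    "edition_a": {"wiki": (20, 80), "offline": (25, 75)},
    "edition_b": {"wiki": (10, 90), "offline": (10, 90)},
}

for name, counts in editions.items():
    print(name, normalized_ratio(*counts["wiki"], *counts["offline"]))
```

In this toy data, edition_a has the higher absolute female share but falls short of its offline baseline, while edition_b matches its baseline. That is the distinction between the absolute and the normalized comparison.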
BTW, the methods you develop to extract gender from biography articles for large-scale analysis may also be repurposed to study other dimensions. One dimension that would interest me is nationality. It would be interesting to see the coverage, focus or bias of a language version on people based on their nationalities. Age might be another one.
Best, han-teng liao
2015-01-11 19:01 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Dear h,
I think your "male gaze" hypothesis is very interesting. It is of course possible to report on the number of female politicians and the like using Wikipedia/Wikidata, but comparisons to other indices are, well, problematic. The World Economic Forum's Global Gender Gap Index (GGI) reports a ratio of politicians (the female-to-male ratio is reported in Table E12: Women in parliament and Table E13: Women in ministerial positions in http://www3.weforum.org/docs/GGGR14/GGGR_CompleteReport_2014.pdf). Unfortunately, those are high-level positions, and I'd expect all but the smallest wikis to score well on coverage of those individuals for their own language. What we would need is a count of, let's say, politicians by gender in general for those countries, and I don't recall any studies doing that, particularly on an internationally comparative basis.

The GGI also has Table E4: Legislators, senior officials and managers, but it lumps politicians together with bureaucrats and managers; though I guess it would be the best comparison we can do here (count the instances of the words politician, official and manager in our data, and compare them with the GGI indicator). Another approach could involve seeing whether there are differences in how an A-language wiki reports on celebrities vs. politicians/officials/managers from a different language, e.g. would Japanese Wikipedia show any bias in covering Korean celebrities vs. Korean politicians/officials/managers, or even Korean parliament members/ministers? Though some of that may be, as you suggest, material for separate paper(s).

We will see if we can report on any of those differences. If you know of any better data to compare ours to, please don't hesitate to suggest it!
--
Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus
On 1/11/2015 23:23, h wrote:
I believe it would be easier to stick to the group of celebrities. For example, do a study of articles on films, especially in romance categories, which would usually have noteworthy subjects for both the lead man and the lead woman. It would be interesting to see whether the biography coverage of the male and the female leads is about the same per country. As far as Asian celebrities go, did you look into manga characters? I know most are fictional, but many are based on real people, and especially the ones based on foreign real people could skew the results. Presumably someone writing a Wikipedia article about a manga based on a real person would also go ahead and create a stub on the real person.
On Mon, Jan 12, 2015 at 12:36 PM, Piotr Konieczny piokon@post.pl wrote:
Re politicians, a trivial observation that is nonetheless better explicit than implicit:
While the World Economic Forum stats are presumably snapshot stats of some current(ish) point in time, Wikipedias cover politicians past and present. This would, of course, skew the results as heavily as the patriarchal practices of the past (and present, yes) have kept women out of politics.
A.

On Jan 12, 2015 3:36 AM, "Piotr Konieczny" piokon@post.pl wrote:
I have spent quite a bit of time at new page patrol over the years. My suspicion is that many if not most of the people who create articles on newly signed pop stars and actors are from their management agency rather than fans, especially if the subjects seem too early in their career to have fans. Sportspeople, I suggest, are more likely to be written about by fans, especially if they have been signed by a major team or, more importantly for Wikipedia, a team with an actively editing fan.
On this theory, the quality of articles, the number of edits and, when we had the Article Feedback Tool, the number of "is hot" type comments would be a good indication of interest from the volunteer editing community. But article creation is in part a matter of the policy of the relevant talent agencies.
Sorry if that sounds overly cynical. Perhaps, if it were possible, one would filter out the articles that get scarcely any views and then look at the gender balance of the articles that are of interest to our audience as well as our editors.
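As a rough sketch of that filter, assuming per-article view counts were available: the article records, the `views` field, the threshold and the function name below are all hypothetical, and the numbers are invented.

```python
def female_share_above_views(articles, min_views):
    """Female share among articles with at least min_views views.

    Returns None when no article passes the threshold.
    """
    kept = [article for article in articles if article["views"] >= min_views]
    if not kept:
        return None
    females = sum(1 for article in kept if article["gender"] == "female")
    return females / len(kept)


# Made-up biographies with made-up view counts.
articles = [
    {"title": "A", "gender": "female", "views": 5},
    {"title": "B", "gender": "female", "views": 12},
    {"title": "C", "gender": "female", "views": 900},
    {"title": "D", "gender": "male", "views": 450},
    {"title": "E", "gender": "male", "views": 3000},
    {"title": "F", "gender": "male", "views": 8},
]

print("all articles:", female_share_above_views(articles, 0))
print("at least 100 views:", female_share_above_views(articles, 100))
```

Comparing the two shares would show whether the gender balance of the articles people actually read differs from the balance of the articles that merely exist.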
Regards
Jonathan Cardy
On 11 Jan 2015, at 22:23, h hanteng@gmail.com wrote:
Hello Piotr and Gerard,
I think a competing hypothesis would be "male gaze". That is to say, the more female representation is not about a culture (defined as national, ethnic, linguistic or regional, not macho/feminine), but rather a gender-interest bias. Thus the more female representation could mean more male dominant culture, which is against the theoretical assumption of Piotr's research. Note that East Asian Wikipedians that I know, especially those who edit Chinese Wikipedia, are predominantly very young. Some of them can be highly interested in opposite sex. Check the following category pages as examples:
(1a) Female actresses of every countries in the world http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9C%8B%E5%A5%B3%E6%BC%94%E... (1b) Male actresses of every countries in the world http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9B%BD%E7%94%B7%E6%BC%94%E...
(2a) Female Japanese AV (i.e. porn) actresses http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E5%A... (2b) Male Japanese AV (i.e. porn) actresses http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E7%9...
It is quiet clear that the male gaze hypothesis seems to apply here. More female presentation simply because they are there to be consumed by men or boys. So one of my suggestions for research is to select a few professional categories that are of interest (say, politicians, poets, entertainers, etc.) to do some cross-tab analysis. Thus, I will be extremely cautious against using the current metrics/methods as viable "gender inequality index". As a proponent of "data normalization" and "geographic normalization" method myself, I would distinguish two sets of comparisons: one is cross-country or cross-language version absolute value comparison, another is cross-country or cross-language version "normalized" value comparison. By geographic normalization, I mean that researchers must gather another set of cross-country or cross-language datasets that captures some aspects of realities "external" to Wikipedia. In this case, I would say the Wikipedia represented politicians' gender ratio against the offline gender ratio of politicians. In other words, "data normalization" allows researchers to compare which language version are more or less (and how much) equal than the corresponding offline societies. BTW, the methods you develop to extract gender from biography articles for large-scale analysis may also be re-purpose to study other dimensions. One dimension that will interest me would be nationality. It will be interesting to see the coverage, focus or bias of a language version on people based on nationalities. Age might be another one.
Best, han-teng liao
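The normalized comparison suggested in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: the `GenderCounts` type, the `normalized_ratio` function, the language codes "xx" and "yy", and all numbers are made up for the example and are not part of the WIGI project or its data.

```python
from dataclasses import dataclass


@dataclass
class GenderCounts:
    """Counts of women and men in one group (hypothetical schema)."""
    female: int
    male: int

    def female_share(self) -> float:
        total = self.female + self.male
        if total == 0:
            raise ValueError("no people counted")
        return self.female / total


def normalized_ratio(wiki: GenderCounts, offline: GenderCounts) -> float:
    """Female share on Wikipedia divided by the female share offline.

    1.0 means the language version mirrors the offline baseline;
    below 1.0 means women are under-represented relative to it.
    """
    baseline = offline.female_share()
    if baseline == 0:
        raise ValueError("offline baseline has no women; ratio undefined")
    return wiki.female_share() / baseline


# Made-up numbers for two hypothetical language versions, politicians only.
wiki_politicians = {
    "xx": GenderCounts(female=150, male=850),
    "yy": GenderCounts(female=100, male=900),
}
offline_politicians = {
    "xx": GenderCounts(female=30, male=70),
    "yy": GenderCounts(female=10, male=90),
}

for lang in wiki_politicians:
    absolute = wiki_politicians[lang].female_share()
    normalized = normalized_ratio(wiki_politicians[lang], offline_politicians[lang])
    print(f"{lang}: absolute={absolute:.2f} normalized={normalized:.2f}")
```

With these invented numbers, "xx" has the higher absolute share of women but the lower normalized value, which is the kind of reversal the two sets of comparisons are meant to expose.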
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Thank you all for the feedback. I have taken away quite a few good ideas for further investigation. To summarize:
Gerard - look at the ratios of those bios of a language which exist only in that language.
Han Teng - "male gaze" hypothesis; create a by-profession cross-tabular analysis.
Jane - look at the ratios of leading actors by language, and at "fictional humans" more closely.
Jonathan - perform a filter step, or perhaps a weighting by page views.
Thanks so much for the advice, what a great list.
Make a great day, Max Klein ‽ http://notconfusing.com/
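As a rough illustration of the by-profession cross-tabulation mentioned in the summary, here is a minimal sketch. It assumes each biography can be reduced to a (language, profession, gender) triple; the language codes "aa" and "bb", the function names, and the rows themselves are invented for the example.

```python
from collections import Counter

# Hypothetical records: (language version, profession, gender).
# Made-up rows for illustration only.
records = [
    ("aa", "actor", "female"),
    ("aa", "actor", "female"),
    ("aa", "actor", "male"),
    ("aa", "politician", "male"),
    ("aa", "politician", "male"),
    ("bb", "actor", "female"),
    ("bb", "actor", "male"),
    ("bb", "politician", "female"),
    ("bb", "politician", "male"),
]


def crosstab(rows):
    """Count biographies per (language, profession, gender) cell."""
    return Counter(rows)


def female_share_by_cell(rows):
    """Return {(language, profession): female share} for every populated cell."""
    counts = crosstab(rows)
    totals = Counter((lang, prof) for lang, prof, _ in rows)
    return {
        (lang, prof): counts[(lang, prof, "female")] / total
        for (lang, prof), total in totals.items()
    }


shares = female_share_by_cell(records)
for (lang, prof), share in sorted(shares.items()):
    print(f"{lang} {prof:<10} female share = {share:.2f}")
```

Breaking the ratio down this way would show whether a high overall female share in one language version is concentrated in a single profession, such as entertainers, rather than spread evenly.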
On Mon, Jan 12, 2015 at 5:57 AM, WereSpielChequers <werespielchequers@gmail.com> wrote:
I have spent quite a bit of time at new page patrol over the years. My suspicion is that many, if not most, of the people who create articles on newly signed pop stars and actors are from their management agencies rather than fans, especially if the subjects seem too early in their careers to have fans. Sportspeople, I suggest, are more likely to be written about by fans, especially if they have been signed by a major team or, more importantly for Wikipedia, a team with an actively editing fan.
On this theory the quality of articles, the number of edits, and (when we had the Article Feedback Tool) the number of "is hot" type comments would be a good indication of interest from the volunteer editing community. But article creation is in part a matter of the policy of the relevant talent agencies.
Sorry if that sounds overly cynical. Perhaps, if it were possible, one would filter out the articles that get scarcely any views and then look at the gender balance of articles that are of interest to our audience as well as our editors.
Regards
Jonathan Cardy
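The page-view filter suggested above, and the weighting variant, might look something like this. It is a sketch under assumptions: the `Bio` record, the view counts, and the threshold of 100 views are all hypothetical, and real page-view data would have to come from elsewhere.

```python
from typing import NamedTuple


class Bio(NamedTuple):
    """One biography article (hypothetical schema)."""
    title: str
    gender: str
    views: int


def female_share(bios, min_views=0):
    """Share of female biographies among articles with at least min_views views."""
    kept = [b for b in bios if b.views >= min_views]
    if not kept:
        raise ValueError("no articles pass the view threshold")
    return sum(b.gender == "female" for b in kept) / len(kept)


def view_weighted_female_share(bios):
    """Share of all page views that go to female biographies."""
    total_views = sum(b.views for b in bios)
    if total_views == 0:
        raise ValueError("no page views recorded")
    return sum(b.views for b in bios if b.gender == "female") / total_views


# Made-up articles: two barely viewed female bios and three widely viewed ones.
bios = [
    Bio("A", "female", 5),
    Bio("B", "female", 10),
    Bio("C", "male", 1000),
    Bio("D", "male", 2000),
    Bio("E", "female", 3000),
]

print(f"unfiltered:    {female_share(bios):.2f}")
print(f"views >= 100:  {female_share(bios, min_views=100):.2f}")
print(f"view-weighted: {view_weighted_female_share(bios):.2f}")
```

The hard filter drops rarely viewed articles entirely, while the weighting keeps them but lets them count for little; which of the two better reflects reader interest is a judgment call.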
To spam this list as well as Twitter :-)
http://magnusmanske.de/wordpress/?p=250
I have a question about P21.
Has any of the GND author sex information leaked into P21? I ask because that is known-bad data.
It's bad because the GND, in all its wisdom, decided to assign sex to authors based on the apparent gender of the name published under, even for periods when many women were known to publish under male pseudonyms. I believe that VIAF takes this data at face value, effectively poisoning pretty much every conceivable use of library authority data for gender analysis for the foreseeable future.
cheers stuart
-- ...let us be heard from red core to black sky
Interesting, Magnus, thanks! After working on lots of the female names in various databases, I can also say that it is pretty difficult to scrape together enough information to produce a Wikipedia-worthy stub on many of the women mentioned in those databases. As you point out, we don't have paintings to illustrate their articles that indicate age and culture (such paintings may still exist, but by now they lack the name of the sitter). What we also lack is their birthdate and birthplace. Many women don't disclose this information, as it is considered impolite. It is often lacking in historical sources as well. Pretty much all we know is that somebody had a mother, but the mother's name and heritage have often fallen by the wayside.
We are very happy to hear that your findings corroborate ours. What's your take on what we think is a rather surprisingly high percentage of female bios for the South Asian/Confucian clusters (i.e. non-Islamic Asia)?
--
Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus
On 1/13/2015 22:07, Magnus Manske wrote:
To spam this list as well as Twitter :-)
http://magnusmanske.de/wordpress/?p=250
On Tue, Jan 13, 2015 at 5:41 PM, Maximilian Klein isalix@gmail.com wrote:
Thank you all for the feedback. I have taken away quite a few good ideas for further investigation. To summarize:
Gerard - look at the ratios of those bios of a language which exist only in that language.
Han Teng - "male gaze" hypothesis; create a by-profession crosstabular analysis.
Jane - look at the ratios of leading actors by language, and at "fictional humans" more closely.
Jonathan - perform a filter step, or perhaps a weighting by page-views.
Thanks so much for the advice, what a great list.
Make a great day, Max Klein ‽ http://notconfusing.com/
On Mon, Jan 12, 2015 at 5:57 AM, WereSpielChequers werespielchequers@gmail.com wrote:
I have spent quite a bit of time at new page patrol over the years. My suspicion is that many if not most of the people who create articles on newly signed pop stars and actors are from their management agency rather than fans, especially if they seem too early in their career to have fans. Sportspeople I suggest are more likely to be written about by fans, especially if they have been signed by a major team, or more importantly for Wikipedia a team with an actively editing fan.
On this theory the quality of articles, the number of edits, and when we had the Article Feedback Tool the number of "is hot" type comments would be a good indication of interest from the volunteer editing community. But article creation is in part a matter of the policy of the relevant talent agencies.
Sorry if that sounds overly cynical. If it were possible, one could filter out the articles that get scarcely any views and then look at the gender balance of the articles that are of interest to our audience as well as our editors.
Regards
Jonathan Cardy
On 11 Jan 2015, at 22:23, h hanteng@gmail.com wrote:
Hello Piotr and Gerard,
I think a competing hypothesis would be "male gaze". That is to say, the greater female representation is not about a culture (defined as national, ethnic, linguistic or regional, not macho/feminine), but rather a gender-interest bias. Thus greater female representation could mean a more male-dominant culture, which is against the theoretical assumption of Piotr's research.
Note that the East Asian Wikipedians that I know, especially those who edit Chinese Wikipedia, are predominantly very young. Some of them can be highly interested in the opposite sex.
Check the following category pages as examples:
(1a) Actresses of every country in the world
http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9C%8B%E5%A5%B3%E6%BC%94%E5%93%A1
(1b) Male actors of every country in the world
http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9B%BD%E7%94%B7%E6%BC%94%E5%91%98
(2a) Female Japanese AV (i.e. porn) actresses
http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E5%A5%B3%E5%84%AA
(2b) Male Japanese AV (i.e. porn) actors
http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E7%94%B7%E5%84%AA
It is quite clear that the male gaze hypothesis seems to apply here. There is more female representation simply because the women are there to be consumed by men or boys.
So one of my suggestions for research is to select a few professional categories that are of interest (say, politicians, poets, entertainers, etc.) to do some cross-tab analysis.
Thus, I would be extremely cautious about using the current metrics/methods as a viable "gender inequality index".
As a proponent of the "data normalization" and "geographic normalization" methods myself, I would distinguish two sets of comparisons: one is a cross-country or cross-language-version comparison of absolute values, the other is a cross-country or cross-language-version comparison of "normalized" values. By geographic normalization, I mean that researchers must gather another set of cross-country or cross-language datasets that captures some aspects of realities "external" to Wikipedia. In this case, I would compare the gender ratio of politicians represented on Wikipedia against the offline gender ratio of politicians. In other words, "data normalization" allows researchers to compare which language versions are more or less equal (and by how much) than the corresponding offline societies.
BTW, the methods you develop to extract gender from biography articles for large-scale analysis may also be re-purposed to study other dimensions. One dimension that would interest me is nationality. It would be interesting to see the coverage, focus or bias of a language version on people based on nationalities. Age might be another one.
Best, han-teng liao
2015-01-11 19:01 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi, Having read it, I find it is still very much Wikipedia-oriented. It makes use of the toolset by Markus. That is fine. The notion of diversity and notability is also very much culturally defined. It would be nice to know how the different Wikipedias accept notability of people from other cultures and if it impacts the diversity of their own articles.
I have found that many people do not have an article in the languages of their own cultures. Often it has to do with an interest in a domain that is more of relevance to the other culture.
Diversity is very much part of a domain; in Roman Catholicism male dominance is obvious. I am curious if diversity in gender is affected by such considerations and if items with a single article are more in line with what is the norm for a culture, a domain. Thanks, GerardM
On 10 January 2015 at 11:51, Piotr Konieczny piokon@post.pl wrote:
Here (http://notconfusing.com/preliminary-results-from-wigi-the-wikipedia-gender-inequality-index/) are some early findings from a research project I am involved in (together with Maximilian Klein). (To find out more about the project, see https://meta.wikimedia.org/wiki/Research:Wikipedia_Gender_Inequality_Index and its talk page). We are very curious what you think (don't hesitate to be critical). What we would really appreciate would be any alternative hypotheses (to the one presented) that could try to explain why post-1950s Confucian and South Asian clusters seem so much more inclusive of female biographies than others (including the "Western" clusters). Are we seeing a data error, or something else - and if so, what?
-- Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Hi Piotr,
I think this may be due to gaps in the underlying data.
From memory, Wikidata initially showed higher proportions of women than was expected from what we knew about Wikipedia, because the existing category structure (e.g. "Female X" but not "Male X") meant that more of the "obvious" cases, which could be inferred from categories, were women. This then dropped down a bit as more of the "less obvious" data was added.
*If* this is generally true, and I believe it is, then projects with a high proportion of biographies which do not have gender set will be more likely to have inaccurate gender statistics - because the obvious cases get added first, but these are not generally representative.
Very cursory check, looking for all P31:Q5 with sitelinks to given projects:
* zhwiki has 132k known people, of which 56k do not have P21 set
* jawiki has 252k known people, of which 79k do not have P21 set
* frwiki has 424k known people, of which just 94(!) do not have P21 set
* eswiki has 248k known people, of which 76 do not have P21 set
So 42% of biographies on the Chinese Wikipedia, and 31% on the Japanese, do not have a known Wikidata gender. Meanwhile, the proportions without gender on Spanish and French are so low as to be negligible.
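As a rough check of the arithmetic above, and of how much room the unset items leave, here is a minimal Python sketch. The counts are the ones from the cursory check; the observed female share of 0.20 is a made-up illustrative value, not a measured one, and the function names are my own.

```python
# Counts in thousands, copied from the cursory check above.
# frwiki's 94 and eswiki's 76 are raw item counts, so they become 0.094 and 0.076.
counts = {
    "zhwiki": (132.0, 56.0),   # (known people, people without P21)
    "jawiki": (252.0, 79.0),
    "frwiki": (424.0, 0.094),
    "eswiki": (248.0, 0.076),
}


def missing_share(total, missing):
    """Fraction of biographies with no gender (P21) statement."""
    return missing / total


def female_share_bounds(observed_female_share, total, missing):
    """Range of the true female share if the items without P21 were
    all male (lower bound) or all female (upper bound)."""
    known = total - missing
    females_known = observed_female_share * known
    low = females_known / total
    high = (females_known + missing) / total
    return low, high


for wiki, (total, missing) in counts.items():
    print(f"{wiki}: {missing_share(total, missing):.1%} without P21")

# Hypothetical: suppose 20% of the zhwiki items that do have P21 are female.
low, high = female_share_bounds(0.20, *counts["zhwiki"])
print(f"zhwiki true female share could lie between {low:.1%} and {high:.1%}")
```

The bounds are deliberately extreme; the point is only that with 42% of items unset, the observed share on its own says little about the true one.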
These numbers also indicate a very large proportion of biographies on these projects don't overlap with Western language coverage (or Wikidata would have picked up gender from those). Given what we already know about how articles are distributed across projects, it seems reasonable to assume they mostly match to people from within the relevant language's cultural group - and there, perhaps, is your explanation...
Regards,
Andrew.
On 16 January 2015 at 11:49, Piotr Konieczny piokon@post.pl wrote:
We are very happy to hear that your findings corroborate ours. What's your take on what we think is a rather surprisingly high percentage of female bios for the South Asian/Confucian clusters (i.e. non-Islamic Asia)?
--
Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus
On 1/13/2015 22:07, Magnus Manske wrote:
To spam this list as well as Twitter :-)
http://magnusmanske.de/wordpress/?p=250
On Tue, Jan 13, 2015 at 5:41 PM, Maximilian Klein isalix@gmail.com wrote:
Thank you all for the feedback. I have taken away quite a few good ideas for further investigation. To summarize:
Gerard - look at the ratios of those bios of a language which exist only in that language.
Han Teng - "male gaze" hypothesis; create a by-profession crosstabular analysis.
Jane - look at the ratios of leading actors by language, and at "fictional humans" more closely.
Jonathan - perform a filter step, or perhaps a weighting by page-views.
Thanks so much for the advice, what a great list.
Make a great day, Max Klein ‽ http://notconfusing.com/
On Mon, Jan 12, 2015 at 5:57 AM, WereSpielChequers werespielchequers@gmail.com wrote:
I have spent quite a bit of time at new page patrol over the years. My suspicion is that many if not most of the people who create articles on newly signed pop stars and actors are from their management agency rather than fans, especially if they seem too early in their career to have fans. Sportspeople I suggest are more likely to be written about by fans, especially if they have been signed by a major team, or more importantly for Wikipedia a team with an actively editing fan.
On this theory the quality of articles, the number of edits, and when we had the Article Feedback Tool the number of "is hot" type comments would be a good indication of interest from the volunteer editing community. But article creation is in part a matter of the policy of the relevant talent agencies.
Sorry if that sounds overly cynical. If it were possible, one could filter out the articles that get scarcely any views and then look at the gender balance of the articles that are of interest to our audience as well as our editors.
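The filter step suggested here could be sketched roughly as follows in Python. The `Bio` record, the view counts and the threshold of 100 are all hypothetical placeholders for illustration; real data would have to come from page-view logs and gender statements.

```python
from typing import Iterable, NamedTuple, Optional


class Bio(NamedTuple):
    title: str
    gender: str   # e.g. "female" or "male" (hypothetical labels)
    views: int    # page views over some chosen window (hypothetical)


def female_share(bios: Iterable[Bio], min_views: int = 0) -> Optional[float]:
    """Share of female biographies among those with at least min_views views.
    Returns None when no biography passes the filter."""
    kept = [b for b in bios if b.views >= min_views]
    if not kept:
        return None
    return sum(b.gender == "female" for b in kept) / len(kept)


# Made-up sample: two rarely viewed articles and three that readers visit.
sample = [
    Bio("A", "female", 5),
    Bio("B", "male", 900),
    Bio("C", "female", 1200),
    Bio("D", "male", 3),
    Bio("E", "male", 40),
]

print(female_share(sample))                 # all articles
print(female_share(sample, min_views=100))  # only articles readers actually visit
```

Comparing the two numbers per language would show whether the gender balance changes once scarcely viewed articles are set aside; weighting by views instead of filtering would be a variation on the same idea.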
Regards
Jonathan Cardy
On 11 Jan 2015, at 22:23, h hanteng@gmail.com wrote:
Hello Piotr and Gerard,
I think a competing hypothesis would be "male gaze". That is to say, the greater female representation is not about a culture (defined as national, ethnic, linguistic or regional, not macho/feminine), but rather a gender-interest bias. Thus greater female representation could mean a more male-dominant culture, which is against the theoretical assumption of Piotr's research.
Note that the East Asian Wikipedians that I know, especially those who edit Chinese Wikipedia, are predominantly very young. Some of them can be highly interested in the opposite sex.
Check the following category pages as examples:
(1a) Actresses of every country in the world
http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9C%8B%E5%A5%B3%E6%BC%94%E...
(1b) Male actors of every country in the world
http://zh.wikipedia.org/wiki/Category:%E5%90%84%E5%9B%BD%E7%94%B7%E6%BC%94%E...
(2a) Female Japanese AV (i.e. porn) actresses
http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E5%A...
(2b) Male Japanese AV (i.e. porn) actors
http://zh.wikipedia.org/w/index.php?title=Category:%E6%97%A5%E6%9C%ACAV%E7%9...
It is quite clear that the male gaze hypothesis seems to apply here. There is more female representation simply because the women are there to be consumed by men or boys.
So one of my suggestions for research is to select a few professional categories that are of interest (say, politicians, poets, entertainers, etc.) to do some cross-tab analysis.
Thus, I would be extremely cautious about using the current metrics/methods as a viable "gender inequality index".
As a proponent of the "data normalization" and "geographic normalization" methods myself, I would distinguish two sets of comparisons: one is a cross-country or cross-language-version comparison of absolute values, the other is a cross-country or cross-language-version comparison of "normalized" values. By geographic normalization, I mean that researchers must gather another set of cross-country or cross-language datasets that captures some aspects of realities "external" to Wikipedia. In this case, I would compare the gender ratio of politicians represented on Wikipedia against the offline gender ratio of politicians. In other words, "data normalization" allows researchers to compare which language versions are more or less equal (and by how much) than the corresponding offline societies.
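One possible reading of this normalization, as a minimal Python sketch: divide the female share among a language version's politician biographies by the female share among politicians offline. The function name and the example figures are hypothetical and only illustrate the idea; other normalizations (for instance a difference instead of a ratio) would be equally defensible.

```python
def normalized_ratio(wiki_female_share: float, offline_female_share: float) -> float:
    """Female share on Wikipedia relative to the offline female share.
    1.0 means Wikipedia mirrors the offline society; below 1.0 means
    women are under-represented relative to it; above 1.0, over-represented."""
    if not 0.0 < offline_female_share <= 1.0:
        raise ValueError("offline_female_share must be in (0, 1]")
    if not 0.0 <= wiki_female_share <= 1.0:
        raise ValueError("wiki_female_share must be in [0, 1]")
    return wiki_female_share / offline_female_share


# Hypothetical figures: 15% of politician biographies in some language
# version are about women, while 30% of that country's politicians are women.
print(normalized_ratio(0.15, 0.30))
```

Under this reading, two language versions with the same absolute share can come out quite differently once the offline baseline is taken into account.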
BTW, the methods you develop to extract gender from biography articles for large-scale analysis may also be re-purposed to study other dimensions. One dimension that would interest me is nationality. It would be interesting to see the coverage, focus or bias of a language version on people based on nationalities. Age might be another one.
Best, han-teng liao
2015-01-11 19:01 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi, Having read it, I find it is still very much Wikipedia-oriented. It makes use of the toolset by Markus. That is fine. The notion of diversity and notability is also very much culturally defined. It would be nice to know how the different Wikipedias accept notability of people from other cultures and if it impacts the diversity of their own articles.
I have found that many people do not have an article in the languages of their own cultures. Often it has to do with an interest in a domain that is more of relevance to the other culture.
Diversity is very much part of a domain; in Roman Catholicism male dominance is obvious. I am curious if diversity in gender is affected by such considerations and if items with a single article are more in line with what is the norm for a culture, a domain. Thanks, GerardM
On 10 January 2015 at 11:51, Piotr Konieczny piokon@post.pl wrote:
Here (http://notconfusing.com/preliminary-results-from-wigi-the-wikipedia-gender-i...) are some early findings from a research project I am involved in (together with Maximilian Klein). (To find out more about the project, see https://meta.wikimedia.org/wiki/Research:Wikipedia_Gender_Inequality_Index and its talk page). We are very curious what you think (don't hesitate to be critical). What we would really appreciate would be any alternative hypotheses (to the one presented) that could try to explain why post-1950s Confucian and South Asian clusters seem so much more inclusive of female biographies than others (including the "Western" clusters). Are we seeing a data error, or something else - and if so, what?
-- Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l