I have pulled together the following table for the past 360 days, counting each time an image was reverted by someone who was not the last uploader, and then attempting to find any declared gender:
2014-2015 Commons file overwrite stats compared to gender
+---------------+----------+
| sex           | count(*) |
+---------------+----------+
| female-female |        1 |
| female-male   |      110 |
| female-none   |      426 |
| male-female   |      139 |
| male-male     |     1376 |
| male-none     |     5711 |
| none-female   |      479 |
| none-male     |     5289 |
| none-none     |    15716 |
+---------------+----------+
Key: "none" means not set in user preferences, "female-male" means a woman has overwritten a man's file and "male-none" means a declared male has overwritten an account with no gender set.
I'd appreciate any views on whether there is any statistical meaning to be pulled from these figures, apart from showing that men probably outnumber women contributors by ten times on Commons.
If the email is displaying badly, you can find a wiki formatted table and the original generating SQL on the Commons village pump[1]. I thought this would be of wider interest because, although "image revert warring" is mostly an issue for Wikimedia Commons, it is very similar to the heated disputes seen in edit revert warring on Wikipedia projects. The question popped up from someone interested in my long running 'significant reverts' tracking report.[2]
Links:
1. https://commons.wikimedia.org/wiki/Commons:Village_pump#Does_openly_declarin...
2. https://commons.wikimedia.org/wiki/User:Fae/SignificantReverts
Fae
2015-08-14 2:20 GMT+02:00 Fæ faewik@gmail.com:
First of all, it seems that the vast majority of Commons users do not select their gender, so they are "none". That obviously spoils the rest of the statistics. It would be good to add the numbers of males, females and "nones" involved, so that it is clearer which group has the stronger general tendency to overwrite. For example, a strong "overwriter" can make 1000 overwrites a year, and a weak one just 1. Such "strong" overwriters can also spoil the statistics. For example, one "strong" female overwriter could easily account for all the overwrites while the rest of the females do not overwrite at all :-) The same applies the other way round, i.e. we could say whether women are more vulnerable to being overwritten if you divide the number of overwrites by the number of females and compare that with the other groups.
Looking at "male-female" and "female-male", and considering the much-cited 15% female editor ratio, it seems women are much more overwrite-happy than men.
Then again, 139:110 are not exactly numbers one wants to base statistics on...
On Fri, Aug 14, 2015 at 9:25 AM Tomasz Ganicz polimerek@gmail.com wrote:
--
Tomek "Polimerek" Ganicz
http://pl.wikimedia.org/wiki/User:Polimerek
http://www.ganicz.pl/poli/
http://www.cbmm.lodz.pl/work.php?id=29&title=tomasz-ganicz
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
Well, given the case of women overwriting photos of themselves, I would totally agree with you. In my (admittedly few) conversations with Wikipedia article subjects, women object most to the photos of themselves, while men talk about specific lines of text.
On 14 August 2015 at 09:58, Magnus Manske magnusmanske@googlemail.com wrote:
Looking at "male-female" and "female-male", and considering the much-cited 15% female editor ratio, it seems women are much more overwrite-happy than men.
Then again, 139:110 are not exactly numbers one wants to base statistics on...
I agree that intuitively the numbers look too marginal to draw strong conclusions, apart perhaps from deducing that, though the proportion of women who are active on Wikimedia Commons remains far too low, a user's gender does not have any significant effect on whether they get overwritten or overwrite others. I wanted to raise it as statistical interpretation is not my forte, and there are a number of researchers who follow this list who are darn hot on statistics or might draw comparisons with existing gender related measurements :-)
With respect to being *overwrite-happy*, it is worth reinforcing that healthy collaborative work on Wikimedia Commons does include files that have a lot of planned overwrites, and only in a minority of cases is significant overwriting a symptom of a dispute. For example, the maps of the USA which track the status of rapidly changing same-sex marriage legislation are some of the most overwritten files on Commons, having hundreds of overwrites, and all of the changes are positive improvements that help to illustrate some popular LGBT articles around this area of newsworthy law and politics.
Cheers, Fae
On 14 August 2015 at 11:33, Fæ faewik@gmail.com wrote:
On 14 August 2015 at 09:58, Magnus Manske magnusmanske@googlemail.com wrote:
Looking at "male-female" and "female-male", and considering the much-cited 15% female editor ratio, it seems women are much more overwrite-happy than men.
I'm just thinking this through again, as there is a logical flaw in this statement. If we take a random sample space limited to just overwrites where men are overwriting women or women are overwriting men, then the ratio of men:women is irrelevant, given a large sample. Effectively, even if men outnumber women by 80% or 90%, the numbers of overwrites of type "male-female" and "female-male" should be almost the same in /both directions/. If the figures are not similar, then factors other than there simply being more men contributing to the project are at play.
Testing this theory, I queried the English Wikipedia database over all time and found:
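The symmetry argument above can be checked with a small simulation. This is only a sketch under a stated assumption: the overwriter and the overwritten uploader are each drawn independently from the same pool of declared-gender users. The function name, the male shares tried and the number of events are all hypothetical and are not taken from any Wikimedia database.

```python
import random


def simulate_cross_overwrites(male_share, n_events, seed=0):
    """Simulate overwrites where both parties are drawn at random.

    Assumption (hypothetical model): overwriter and overwritten uploader
    are drawn independently from the same pool of declared-gender users,
    of which `male_share` are male.
    Returns (male_over_female, female_over_male) counts.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    male_over_female = 0
    female_over_male = 0
    for _ in range(n_events):
        overwriter_is_male = rng.random() < male_share
        overwritten_is_male = rng.random() < male_share
        if overwriter_is_male and not overwritten_is_male:
            male_over_female += 1
        elif overwritten_is_male and not overwriter_is_male:
            female_over_male += 1
    return male_over_female, female_over_male


# Try a balanced pool and two heavily male pools.
for share in (0.5, 0.8, 0.9):
    mf, fm = simulate_cross_overwrites(share, 100_000)
    print(f"male share {share:.0%}: male-female={mf}, female-male={fm}")
```

Under this model both directions have the same probability, male_share * (1 - male_share), so the two counts should differ only by sampling noise whatever the male share is.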
+-------------+----------+
| sex         | count(*) |
+-------------+----------+
| female-male |       63 |
| male-female |      127 |
+-------------+----------+
Checking the all time figures for Commons shows:
+-------------+----------+
| sex         | count(*) |
+-------------+----------+
| female-male |     1309 |
| male-female |     2220 |
+-------------+----------+
Quickly going to French (far less statistically significant):
+-------------+----------+
| sex         | count(*) |
+-------------+----------+
| female-male |        1 |
| male-female |       12 |
+-------------+----------+
German:
+-------------+----------+
| sex         | count(*) |
+-------------+----------+
| female-male |        6 |
| male-female |       39 |
+-------------+----------+
The conclusion has to be that women are considerably more likely (about 1.7 times on Commons and about 2 times on English Wikipedia, going by the all-time figures above) to have an image overwritten by a man on our projects than the reverse happening. The numbers are sufficiently large for it to appear a meaningful result.
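One way to check whether the two directions differ by more than chance is an exact two-sided binomial (sign) test against a 50:50 split. The following is a minimal sketch, assuming each overwrite is an independent event, which the earlier point about a few prolific overwriters calls into question. The function name is hypothetical; the counts are copied from the tables in this thread.

```python
from math import comb


def binomial_two_sided_p(female_over_male, male_over_female):
    """Exact two-sided p-value for H0: both directions are equally likely.

    Under H0 the smaller count follows Binomial(n, 0.5), which is
    symmetric, so the two-sided p-value is twice the lower tail
    (capped at 1).
    """
    n = female_over_male + male_over_female
    k = min(female_over_male, male_over_female)
    tail = sum(comb(n, i) for i in range(k + 1))  # exact integer arithmetic
    return min(1.0, 2 * tail / 2 ** n)


# (female-male, male-female) counts as posted in this thread.
COUNTS = {
    "Commons, past 360 days": (110, 139),
    "Commons, all time": (1309, 2220),
    "English Wikipedia, all time": (63, 127),
    "French, all time": (1, 12),
    "German, all time": (6, 39),
}

for label, (fm, mf) in COUNTS.items():
    p = binomial_two_sided_p(fm, mf)
    print(f"{label}: ratio male-female/female-male = {mf / fm:.2f}, p = {p:.3g}")
```

A small p-value here only says the imbalance is unlikely under a 50:50 split of independent events. It does not say why the imbalance exists, and it does not account for the accounts with no gender set.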
Fae
especially when the overwrites are non-contentious (most of my overwrites are to provide better quality or higher resolution for photos of paintings)
On Fri, 14/08/2015 at 01.20 +0100, Fæ wrote:
I'd appreciate any views on whether there is any statistical meaning to be pulled from these figures, apart from showing that men probably outnumber women contributors by ten times on Commons.
To have a real answer you should do a statistical test - after addressing the considerations written by Tomasz. However, my rough guess is that the answer is no: if you sum up the rows and the columns, the proportions of declared male/female/none among reverters and reverted are similar. I've put the numbers on the Commons page [1].
Laurentius
[1] https://commons.wikimedia.org/wiki/Commons:Village_pump#Does_openly_declarin...
To summarize the numbers a bit:
Overwrites made by each group (the first term of each pair, per the key):
  Male: 7226
  Female: 537
  None: 21484

Uploads that were overwritten, by the uploader's group (the second term of each pair):
  Male: 6775
  Female: 619
  None: 21853

As a percentage of all overwrites made:
  Male: 24.7%
  Female: 1.8%
  None: 73.5%

As a percentage of all uploads that were overwritten:
  Male: 23.2%
  Female: 2.1%
  None: 74.7%
Assuming random assortment, we would expect:
  Male-male: 0.247*0.232 = 5.7%
  Male-female: 0.52%
  Male-none: 18.5%
  Female-male: 0.42%
  Female-female: 0.038%
  Female-none: 1.3%
  None-male: 17.1%
  None-female: 1.5%
  None-none: 54.9%
Given the number of events observed, that translates to expectations of:
  Male-male = 1667
  Male-female = 152
  Male-none = 5412
  Female-male = 122
  Female-female = 11
  Female-none = 380
  None-male = 5003
  None-female = 439
  None-none = 16061
These numbers are broadly consistent with what you posted. There are some deviations, but given the sample size and the number of possible confounding factors (some of which have been mentioned by others) I think you'd need evidence of a quite large effect before assuming there was an important bias.
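The "random assortment" expectation above is the usual independence model for a contingency table, so it can be recomputed directly from the posted counts, along with a Pearson chi-square statistic. This is a sketch only, not the calculation actually used in this message: the helper names are hypothetical, pairs are read as (overwriter, overwritten) following the key given with the table, and the test assumes every overwrite is an independent event. Because it works from unrounded totals, its expected counts differ slightly from the rounded figures listed above.

```python
from math import exp

GROUPS = ("female", "male", "none")

# Observed counts from the table posted in this thread,
# keyed as (overwriter, overwritten).
OBSERVED = {
    ("female", "female"): 1,
    ("female", "male"): 110,
    ("female", "none"): 426,
    ("male", "female"): 139,
    ("male", "male"): 1376,
    ("male", "none"): 5711,
    ("none", "female"): 479,
    ("none", "male"): 5289,
    ("none", "none"): 15716,
}


def expected_counts(observed):
    """Expected counts if overwriter and overwritten groups were independent."""
    total = sum(observed.values())
    row = {g: sum(observed[(g, h)] for h in GROUPS) for g in GROUPS}
    col = {h: sum(observed[(g, h)] for g in GROUPS) for h in GROUPS}
    return {(g, h): row[g] * col[h] / total for g in GROUPS for h in GROUPS}


def pearson_chi2(observed):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(observed)
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)


def chi2_sf_df4(x):
    """Upper-tail probability for chi-square with 4 degrees of freedom.

    A 3x3 table has (3 - 1) * (3 - 1) = 4 degrees of freedom; for an even
    number of degrees of freedom the tail has this closed form.
    """
    return exp(-x / 2) * (1 + x / 2)


expected = expected_counts(OBSERVED)
for pair in sorted(OBSERVED):
    print(f"{pair[0]}-{pair[1]}: observed {OBSERVED[pair]}, expected {expected[pair]:.1f}")

stat = pearson_chi2(OBSERVED)
print(f"chi-square = {stat:.1f}, p = {chi2_sf_df4(stat):.3g}")
```

With close to 30,000 events even small deviations can come out as statistically significant, so the size of each deviation, and the confounding factors already mentioned, matter more than the p-value alone.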
-Robert Rohde