Actually, I think it would be useful to measure, for all existing female bios versus all existing male bios, the proportion that were previously deleted and then recreated. I have a theory that it is much more difficult to create bios of women in any category, because of the systemic academic bias against including women's biographies in the "reliable sources" mostly used on Wikipedia. I would be especially interested in a comparison of the male-female ratio of bios in established dictionaries of biography with the ratio on Wikipedia, and, of those bios, how many were previously deleted on Wikipedia and recreated.
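
A minimal sketch of that comparison, assuming each bio has already been reduced to a (gender, was_recreated) pair. Both fields are hypothetical: the gender would have to come from a guesser like the one quoted below, and the recreation flag from the deletion logs. The sample data is made up purely to show the arithmetic.

```python
from collections import Counter


def recreation_rates(bios):
    """Return {gender: share of bios that were deleted and later recreated}.

    `bios` is an iterable of (gender, was_recreated) pairs. Both fields are
    assumptions for this sketch, not something Wikipedia provides directly.
    """
    totals = Counter()
    recreated = Counter()
    for gender, was_recreated in bios:
        totals[gender] += 1
        if was_recreated:
            recreated[gender] += 1
    # float() keeps the division exact on Python 2 as well as Python 3
    return {g: recreated[g] / float(totals[g]) for g in totals}


# Made-up sample, for illustration only
sample = [
    ('female', True), ('female', False),
    ('male', True), ('male', False), ('male', False), ('male', False),
]
rates = recreation_rates(sample)
for gender in sorted(rates):
    print('%s: %.2f' % (gender, rates[gender]))
```

Running the same tally over the entries of a dictionary of biography would give a baseline to set the Wikipedia figures against.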

On Mon, Apr 13, 2015 at 7:02 PM, WereSpielChequers <werespielchequers@gmail.com> wrote:
Hi Joseph,

That would be fine for established articles, but in my experience most new bios that get speedy deleted within a day or two of creation don't ever get an infobox added.



Regards

Jonathan Cardy


> On 13 Apr 2015, at 13:56, Joseph Reagle <joseph.2011@reagle.org> wrote:
>
>> On 04/12/2015 03:38 PM, Emilio J. Rodríguez-Posada wrote:
>> 2015-04-12 21:18 GMT+02:00 WereSpielChequers
>> <werespielchequers@gmail.com>:
>>
>>    Firstly looking at gender ratios of deleted and undeleted bios to
>>    see if there is an overall gender skew.
>>
>> I share here this page of deleted and recreated pages
>> https://en.wikipedia.org/wiki/User:Emijrp/Deletionism/2011 just in case
>> someone wants to explore that.
>
> I have python code that's pretty good at guessing the gender of
> biographical subjects, but originally it scraped HTML given a list of
> names. If someone had some code for retrieving the wikitext and
> determining that it is a biography (neither of which would be hard) it
> would be very easy to determine. Here's some pseudocode:
>
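> ```
> #!/usr/bin/python2.7
>
> # titles, title_exists(), get_wiki() and guess_gender() are placeholders
> # that still have to be supplied.
>
> def is_bio(article):
>    '''True if article is not '{{hndis}}' and has '{{infobox person}}'.'''
>    text = article.lower()
>    return '{{hndis' not in text and '{{infobox person' in text
>
> # Initialise the counters once, before the loop, so they accumulate.
> males = females = unknowns = 0
> for title in titles:
>    if title_exists(title):
>        article = get_wiki(title)
>        if is_bio(article):
>            gender = guess_gender(article)
>            print('%s: %s' % (title, gender))
>            if gender == 'male':
>                males += 1
>            elif gender == 'female':
>                females += 1
>            else:
>                unknowns += 1
> print('males = %s; females = %s; unknowns = %s'
>    % (males, females, unknowns))
> ```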
>
>
> _______________________________________________
> Gendergap mailing list
> Gendergap@lists.wikimedia.org
> To manage your subscription preferences, including unsubscribing, please visit:
> https://lists.wikimedia.org/mailman/listinfo/gendergap
