Hi Joseph,
That would be fine for established articles, but in my experience most new
bios that get speedy deleted within a day or two of creation don't ever get
an infobox added.
Regards
Jonathan Cardy
On 13 Apr 2015, at 13:56, Joseph Reagle <joseph.2011(a)reagle.org> wrote:
On 04/12/2015 03:38 PM, Emilio J. Rodríguez-Posada wrote:
2015-04-12 21:18 GMT+02:00 WereSpielChequers <werespielchequers(a)gmail.com>:
Firstly looking at gender ratios of deleted and undeleted bios to
see if there is an overall gender skew.
I'm sharing this page of deleted and recreated pages,
https://en.wikipedia.org/wiki/User:Emijrp/Deletionism/2011, in case
someone wants to explore that.
I have Python code that's pretty good at guessing the gender of
biographical subjects, but originally it scraped HTML given a list of
names. If someone had some code for retrieving the wikitext and
determining that it is a biography (neither of which would be hard), it
would be very easy to determine the gender ratio. Here's some pseudocode:
```
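#!/usr/bin/python2.7

def is_bio(article):
    '''True if article lacks '{{hndis}}' and has '{{infobox person}}'.'''
    text = article.lower()
    return '{{hndis}}' not in text and '{{infobox person' in text

# titles, get_wiki() and guess_gender() are placeholders defined elsewhere.
# get_wiki() is assumed to return None when the title does not exist.
males = females = unknowns = 0  # initialise once, outside the loop
for title in titles:
    article = get_wiki(title)
    if article is not None and is_bio(article):
        gender = guess_gender(article)
        print('%s: %s' % (title, gender))
        if gender == 'male':
            males += 1
        elif gender == 'female':
            females += 1
        else:
            unknowns += 1
print('males = %s; females = %s; unknowns = %s'
      % (males, females, unknowns))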
```
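For the two missing pieces mentioned above (retrieving the wikitext and deciding whether it is a biography), here is a minimal sketch, not a tested tool. It assumes Python 3 and the standard MediaWiki Action API query for the latest revision. The names `extract_wikitext`, `API_URL` and the User-Agent string are made up for illustration, and `{{hndis}}` (human name disambiguation) is my reading of the `{{hsdis}}` in the pseudocode. It only sees pages that currently exist, so it would not help with text that has already been deleted.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = 'https://en.wikipedia.org/w/api.php'

# Loose patterns; both are assumptions about how bios are marked up.
INFOBOX_PERSON = re.compile(r'\{\{\s*infobox[ _]+person\b', re.IGNORECASE)
NAME_DISAMBIG = re.compile(r'\{\{\s*hndis\b', re.IGNORECASE)


def extract_wikitext(response):
    '''Return the wikitext from a decoded API response, or None if absent.'''
    pages = response.get('query', {}).get('pages', [])
    if not pages or pages[0].get('missing') or 'revisions' not in pages[0]:
        return None
    return pages[0]['revisions'][0]['slots']['main']['content']


def get_wiki(title):
    '''Fetch the current wikitext of title; None if the page does not exist.'''
    params = urlencode({
        'action': 'query', 'prop': 'revisions', 'rvprop': 'content',
        'rvslots': 'main', 'titles': title,
        'format': 'json', 'formatversion': '2',
    })
    request = Request(API_URL + '?' + params,
                      headers={'User-Agent': 'gendergap-sketch/0.1 (example)'})
    with urlopen(request) as reply:
        return extract_wikitext(json.loads(reply.read().decode('utf-8')))


def is_bio(article):
    '''True if the text has a person infobox and is not a name disambiguation.'''
    has_infobox = INFOBOX_PERSON.search(article) is not None
    is_disambig = NAME_DISAMBIG.search(article) is not None
    return has_infobox and not is_disambig
```

Parsing is kept separate from fetching so the biography check can be tried on saved wikitext without touching the network. The infobox test is the weak point: an article without `{{infobox person}}` is simply not counted as a biography.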
_______________________________________________
Gendergap mailing list
Gendergap(a)lists.wikimedia.org
To manage your subscription preferences, including unsubscribing, please
visit: