Hi Joseph,
That would be fine for established articles, but in my experience most new bios that get speedy deleted within a day or two of creation don't ever get an infobox added.
Regards
Jonathan Cardy
On 13 Apr 2015, at 13:56, Joseph Reagle joseph.2011@reagle.org wrote:
On 04/12/2015 03:38 PM, Emilio J. Rodríguez-Posada wrote:
2015-04-12 21:18 GMT+02:00 WereSpielChequers <werespielchequers@gmail.com>:
Firstly looking at gender ratios of deleted and undeleted bios to see if there is an overall gender skew.
I share here this page of deleted and recreated pages https://en.wikipedia.org/wiki/User:Emijrp/Deletionism/2011 just in case someone wants to explore that.
I have Python code that's pretty good at guessing the gender of biographical subjects, but originally it scraped HTML given a list of names. If someone had some code for retrieving the wikitext and determining that it is a biography (neither of which would be hard), it would be very easy to determine the gender ratio. Here's some pseudocode:
#!/usr/bin/python2.7

def is_bio(article):
    '''True if article is not '{{hsdis}}' and has '{{infobox person}}'.'''
    text = article.lower()
    return '{{hsdis}}' not in text and '{{infobox person' in text

# titles, title_exists(), get_wiki() and guess_gender() are placeholders
males = females = unknowns = 0
for title in titles:
    if title_exists(title):
        article = get_wiki(title)
        if is_bio(article):
            gender = guess_gender(article)
            print('%s: %s' % (title, gender))
            if gender == 'male':
                males += 1
            elif gender == 'female':
                females += 1
            else:
                unknowns += 1
print('males = %s; females = %s; unknowns = %s' % (males, females, unknowns))
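For anyone who wants to try this without the scraping part, here is a minimal, self-contained sketch of the same idea. It assumes, as in the pseudocode above, that a biography can be recognised by an infobox person template and that '{{hsdis}}' marks pages to skip. The names tally_genders, get_wiki and guess_gender are hypothetical: the last two are callables you would supply yourself (for example a wikitext fetcher and the gender guesser), not existing library functions.

```python
import re

# Assumption: a biography carries an "infobox person" template, and pages
# tagged {{hsdis}} are to be skipped, following the pseudocode in the thread.
INFOBOX_RE = re.compile(r'\{\{\s*infobox[ _]+person\b', re.IGNORECASE)
SKIP_RE = re.compile(r'\{\{\s*hsdis\s*\}\}', re.IGNORECASE)


def is_bio(wikitext):
    """Return True if the wikitext has an infobox person and no {{hsdis}} tag."""
    if SKIP_RE.search(wikitext):
        return False
    return INFOBOX_RE.search(wikitext) is not None


def tally_genders(titles, get_wiki, guess_gender):
    """Count guessed genders over the titles that resolve to biographies.

    get_wiki(title) is a hypothetical callable returning the wikitext,
    or None when the page does not exist.
    guess_gender(wikitext) is a hypothetical callable returning 'male',
    'female', or anything else (counted as unknown).
    """
    counts = {'male': 0, 'female': 0, 'unknown': 0}
    for title in titles:
        wikitext = get_wiki(title)
        if wikitext is None or not is_bio(wikitext):
            continue  # missing page or not a biography: not counted at all
        gender = guess_gender(wikitext)
        key = gender if gender in ('male', 'female') else 'unknown'
        counts[key] += 1
    return counts
```

Passing the fetcher and the guesser in as arguments means the counting logic can be checked against a small dict of sample pages before it is pointed at the live site or a dump of deleted articles.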
Gendergap mailing list Gendergap@lists.wikimedia.org To manage your subscription preferences, including unsubscribing, please visit: https://lists.wikimedia.org/mailman/listinfo/gendergap