Naming patterns change over time and vary by geography. If you're interested in the gender of current-day authors, you should probably constrain your name sampling to the same timeframe.
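
As a rough illustration of what I mean by constraining the sample, here is a minimal sketch. It assumes you already have (first name, gender, birth year) rows from whatever source you use; the function name and the sample rows are made up for the example and are not taken from Wikidata or Freebase.

```python
from collections import Counter, defaultdict

def gender_counts_by_name(records, start_year, end_year):
    """Tally genders per first name, using only people born
    between start_year and end_year (inclusive)."""
    counts = defaultdict(Counter)
    for first_name, gender, birth_year in records:
        if start_year <= birth_year <= end_year:
            counts[first_name.lower()][gender] += 1
    return counts

# Made-up rows for illustration only: (first name, gender, birth year)
sample = [
    ("Andrea", "male", 1890),
    ("Andrea", "female", 1975),
    ("Andrea", "female", 1982),
]

# Only the two rows born in 1950-2000 are counted.
recent = gender_counts_by_name(sample, 1950, 2000)
```

The same idea extends to geography by adding a country field to each row and filtering on it as well.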

There's an app that works off the Freebase data here:
http://namegender.freebaseapps.com/

It also has an API that returns JSON:
http://namegender.freebaseapps.com/gender_api?name=andrea
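
If you want to call it from a script, something like the following should do. This is only a sketch: the endpoint and the `name` parameter come from the URL above, but the helper names are my own, and since I'm not describing the response fields here, it just returns the parsed JSON without assuming any particular keys.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the URL above.
BASE_URL = "http://namegender.freebaseapps.com/gender_api"

def build_url(name):
    """Build the query URL, percent-encoding the name as UTF-8."""
    return BASE_URL + "?" + urlencode({"name": name})

def lookup(name, timeout=10):
    """Fetch and parse the JSON response for one first name.

    The structure of the result is not assumed here; inspect it
    before relying on any particular field.
    """
    with urlopen(build_url(name), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `lookup("andrea")` requests the same URL as the example above. Going through `urlencode` matters mostly for names with accents or spaces.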

Based on the top name stats, it looks like its sample is a little more than twice the size of Wikidata's.

Tom

On Sun, Oct 13, 2013 at 6:16 PM, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
Hi all,

I'd like to share a little Wikidata application: I just used Wikidata to guess the sex of people based on their (first) name [1]. My goal was to determine gender bias among the authors in several research areas. This is how some people spend their free time on weekends ;-)

In the process, I also created a long list of first names with associated sex information from Wikidata [2]. It is not super clean but it served its purpose. If you are a researcher, then maybe the gender bias of journals/conferences is interesting to you as well. Details and some discussion of the results are online [1].

Cheers,

Markus

[1] http://korrekt.org/page/Note:Sex_Distributions_in_Research
[2] https://docs.google.com/spreadsheet/ccc?key=0AstQ5xfO-xXGdE9UVkxNc0JMVWJzNmJqNmhPRjc0cnc&usp=sharing

_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l