On 14/10/13 18:18, Tom Morris wrote:
Naming patterns change over time and geography. If you're interested in the gender of current-day authors, you should probably constrain your name sampling to the same timeframe.
I think geography has a much bigger impact than time here. Unfortunately, the names I try to find the sex for do not come with an obvious hint on their geographic origin, so I cannot really use this. I think filtering by time will not have a big impact, since most people on Wikipedia are from the 20th century anyway. So there should be a natural tendency to overrule older uses of names.
There's an app that works off the Freebase data here: http://namegender.freebaseapps.com/
It also has an API that returns JSON: http://namegender.freebaseapps.com/gender_api?name=andrea
Based on the top name stats, it looks like its sample is a little more than twice the size of Wikidata's.
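For anyone who wants to script against the JSON API mentioned above, here is a minimal sketch. It assumes the endpoint is still reachable and only uses the `name` parameter shown in the example URL. The function names are hypothetical, and since the layout of the returned JSON is not described in this thread, the sketch returns the decoded response as-is rather than guessing at its fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL in this thread.
BASE_URL = "http://namegender.freebaseapps.com/gender_api"


def build_query_url(name: str) -> str:
    """Return the query URL for one first name (percent-encoded)."""
    return f"{BASE_URL}?{urlencode({'name': name})}"


def fetch_gender_stats(name: str, timeout: float = 10.0):
    """Fetch and decode the JSON response for one first name.

    The response schema is an unknown here, so the decoded object
    is returned unchanged for the caller to inspect.
    """
    with urlopen(build_query_url(name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `build_query_url("andrea")` reproduces the example URL above; `fetch_gender_stats("andrea")` needs network access and a live service.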
Nice. Christian Thiele also pointed me to a beautiful web service based on Wikipedia Personendaten (German language, but many things are easy to figure out, I guess):
http://toolserver.org/~apper/pd/vorname/top
http://toolserver.org/~apper/pd/vorname/Maria
This illustrates nicely how to take the effect of time into account.
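To make the point about time concrete, here is a rough sketch of a name-based guess that can be restricted to a birth-year window. It is not the code I used for the analysis. The record layout (first name, sex, birth year), the function names, and the 0.8 majority threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_name_table(records, min_year=None, max_year=None):
    """Count sexes per lower-cased first name, optionally within a birth-year window.

    `records` is an iterable of (first_name, sex, birth_year) tuples;
    birth_year may be None. Records without a year are skipped whenever
    a window is requested. (Hypothetical layout, not a Wikidata format.)
    """
    counts = defaultdict(Counter)
    windowed = min_year is not None or max_year is not None
    for first_name, sex, birth_year in records:
        if windowed:
            if birth_year is None:
                continue
            if min_year is not None and birth_year < min_year:
                continue
            if max_year is not None and birth_year > max_year:
                continue
        counts[first_name.lower()][sex] += 1
    return counts


def guess_sex(counts, first_name, min_share=0.8):
    """Return the majority sex for a name, or None if unknown or too ambiguous."""
    tally = counts.get(first_name.lower())
    if not tally:
        return None
    sex, hits = tally.most_common(1)[0]
    return sex if hits / sum(tally.values()) >= min_share else None
```

With a table built over all years, a name that switched usage over time can fall below the threshold and stay unclassified, while a table built with `min_year=1900` may classify it cleanly. Geography could be handled the same way if the records carried a country field.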
Markus
On Sun, Oct 13, 2013 at 6:16 PM, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
Hi all,

I'd like to share a little Wikidata application: I just used Wikidata to guess the sex of people based on their (first) name [1]. My goal was to determine gender bias among the authors in several research areas. This is how some people spend their free time on weekends ;-)

In the process, I also created a long list of first names with associated sex information from Wikidata [2]. It is not super clean but it served its purpose. If you are a researcher, then maybe the gender bias of journals/conferences is interesting to you as well. Details and some discussion of the results are online [1].

Cheers,

Markus

[1] http://korrekt.org/page/Note:Sex_Distributions_in_Research
[2] https://docs.google.com/spreadsheet/ccc?key=0AstQ5xfO-xXGdE9UVkxNc0JMVWJzNmJqNmhPRjc0cnc&usp=sharing

Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l