On 14/10/13 18:18, Tom Morris wrote:
Naming patterns change over time and geography. If you're interested in
the gender of current day authors, you should probably constrain your
name sampling to the same timeframe.
I think geography has a much bigger impact than time here.
Unfortunately, the names I try to find the sex for do not come with an
obvious hint on their geographic origin, so I cannot really use this. I
think filtering by time will not have a big impact, since most people on
Wikipedia are from the 20th century anyway. So there should be a natural
tendency to overrule older uses of names.
There's an app that works off the Freebase data here:
http://namegender.freebaseapps.com/
It also has an API that returns JSON:
http://namegender.freebaseapps.com/gender_api?name=andrea
Based on the top name stats, it looks like its sample is a little more
than twice the size of Wikidata's.
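
For anyone who wants to script against that API, here is a minimal Python sketch. It only uses the URL quoted above; the function names are made up for illustration, the service may not be reachable any more, and since the thread does not describe the fields of the JSON reply (or its encoding, assumed to be UTF-8 here), the sketch just returns whatever the server sends, parsed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as quoted in the thread; availability is not guaranteed.
API_BASE = "http://namegender.freebaseapps.com/gender_api"


def build_query_url(name: str) -> str:
    """Return the API URL for one first name, percent-encoding it as needed."""
    return API_BASE + "?" + urlencode({"name": name})


def fetch_gender_info(name: str, timeout: float = 10.0):
    """Fetch and parse the JSON reply.

    Assumption: the reply is UTF-8 encoded JSON. Its fields are not
    documented in the thread, so nothing is extracted from it here.
    """
    with urlopen(build_query_url(name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling build_query_url("andrea") reproduces the example URL above; fetch_gender_info("andrea") would then perform the actual request.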
Nice. Christian Thiele also pointed me to a beautiful web service based
on Wikipedia Personendaten (German language, but many things are easy to
figure out, I guess):
http://toolserver.org/~apper/pd/vorname/top
http://toolserver.org/~apper/pd/vorname/Maria
This illustrates nicely how to take the effect of time into account.
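
To make the time effect concrete, here is a rough Python sketch of a majority vote over name records, with an optional birth-year window. This is not the method behind either web service or my spreadsheet; the record layout, the function name, and the toy data are all invented for illustration.

```python
from collections import Counter

# Toy records (first name, sex, birth year); invented for illustration only.
RECORDS = [
    ("Andrea", "male", 1510),
    ("Andrea", "male", 1620),
    ("Andrea", "male", 1890),
    ("Andrea", "female", 1955),
    ("Andrea", "female", 1972),
    ("Maria", "female", 1930),
]


def guess_sex(name, records, born_from=None, born_to=None):
    """Return (majority sex, share of votes), or (None, 0.0) if nothing matches."""
    votes = Counter(
        sex
        for rec_name, sex, year in records
        if rec_name == name
        and (born_from is None or year >= born_from)
        and (born_to is None or year <= born_to)
    )
    if not votes:
        return None, 0.0
    sex, count = votes.most_common(1)[0]
    return sex, count / sum(votes.values())
```

With the toy data, the unfiltered vote for "Andrea" goes to "male", while restricting to people born from 1900 onwards flips it to "female". Real data would need the same kind of window to match the period of the authors being studied.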
Markus
On Sun, Oct 13, 2013 at 6:16 PM, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
Hi all,
I'd like to share a little Wikidata application: I just used
Wikidata to guess the sex of people based on their (first) name [1].
My goal was to determine gender bias among the authors in several
research areas. This is how some people spend their free time on
weekends ;-)
In the process, I also created a long list of first names with
associated sex information from Wikidata [2]. It is not super clean
but it served its purpose. If you are a researcher, then maybe the
gender bias of journals/conferences is interesting to you as well.
Details and some discussion of the results are online [1].
Cheers,
Markus
[1] http://korrekt.org/page/Note:Sex_Distributions_in_Research
[2] https://docs.google.com/spreadsheet/ccc?key=0AstQ5xfO-xXGdE9UVkxNc0JMVWJzNmJqNmhPRjc0cnc&usp=sharing
_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l