Hi Markus and Tom,
Markus, I understand that your report does not claim to be research. I actually think it's a really fantastic and mind-expanding idea of what we can do with Wikidata. It's given me much food for thought: not only can we correlate Wikidata fields, but we can use that data as predictors. That's a step further than I ever imagined going. What I wrote, and am writing, is not intended as a criticism of your efforts.
Tom, the link you are citing is my own paper, and the data I reported is purely "positive" and "descriptive" of not only Wikipedia, but also the subset of Wikipedia data that has been migrated to Wikidata. Which is to say that it contains all the biases of those processes. That is not necessarily bad - but what might be is the reasoning that stems from interpreting these results. For instance, depending on how you define sexually ambiguous humans, between 0.1% and 1.7% of humans could be classified as such. [1] So conservatively Wikidata is more than one order of magnitude off, and could be nearly three orders of magnitude off. And then all of a sudden we're interpreting our own bias as truth, and possibly just simplifying people out of existence.
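For anyone who wants to check that comparison, here is a rough sketch of the arithmetic in Python. It assumes the 0.1% to 1.7% population estimates from [1] and the 0.002% to 0.006% range quoted from the article in Tom's message below; the variable and function names are only illustrative, and the result is no better than those input estimates.

```python
import math

# Share of humans that could be classified as sexually ambiguous,
# in percent, per the estimates cited in [1].
population_pct = (0.1, 1.7)

# Share reported for Wikidata, in percent, as quoted from the
# code4lib article in Tom's message.
wikidata_pct = (0.002, 0.006)


def orders_of_magnitude(expected, observed):
    """Return log10 of the ratio between two percentages."""
    return math.log10(expected / observed)


# Smallest gap: lowest population estimate vs. highest Wikidata figure.
smallest_gap = orders_of_magnitude(population_pct[0], wikidata_pct[1])

# Largest gap: highest population estimate vs. lowest Wikidata figure.
largest_gap = orders_of_magnitude(population_pct[1], wikidata_pct[0])

print(f"Wikidata under-represents by {smallest_gap:.1f} "
      f"to {largest_gap:.1f} orders of magnitude")
```

Under those assumptions the gap comes out between roughly 1.2 and 2.9 orders of magnitude, that is, a factor of about 17 to 850.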
I'm not saying anyone has done anything wrong. I just feel abstractly concerned - that's nobody's fault in particular. Somehow Wikidata has given us the power to better quantify our view of the world, and our bias is really becoming clear - numerically. Then we think about things like making a bot to give people properties based on strings, and placing value constraints on the sex property. Who is that helping? The software is beautifully built so it doesn't force us to do any of this. I would argue it's our inherited worldviews that are guiding us.
Sorry to rant. These are my feelings; they do not require a response. I do not demand or request that anybody feel the same way.
[1] https://en.wikipedia.org/wiki/Intersex#Prevalence. (Citing the underlying citations of https://www.worldcat.org/search?qt=wikipedia&q=isbn%3A0465077137)
Maximilian Klein Wikipedian in Residence, OCLC +17074787023
________________________________
From: wikidata-l-bounces@lists.wikimedia.org on behalf of Tom Morris <tfmorris@gmail.com>
Sent: Tuesday, October 15, 2013 9:14 AM
To: Discussion list for the Wikidata project.
Subject: Re: [Wikidata-l] Application: sexing people by name/research gender bias
On Tue, Oct 15, 2013 at 7:50 AM, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
My error margins are far too wide to make any realistic statement about "minority genders" even if I had a method to consider them.
This article: http://journal.code4lib.org/articles/8964 gives them as being in the range 0.002% - 0.006% so they're unlikely to affect any real-world analysis.
Tom