On Sun, Apr 20, 2014 at 6:14 AM, Stuart A. Yeates syeates@gmail.com wrote:
On 20/04/2014 11:05 AM, "Gerard Meijssen" gerard.meijssen@gmail.com wrote:
What I do know is that at Wikidata we harvest information from all Wikipedias. That includes en.wp, but not exclusively. It includes the Russian, the Chinese, the Arabic ... all Wikipedias. As you know, the first operational task for Wikidata is to replace the old interlanguage links. A next objective is to include all the information that is currently held in infoboxes.
What process does Wikidata have when different wikis have different policies about what should appear by default in infoboxes, in particular when a policy calls for discretion or human judgement?
It doesn't have good processes or good policies, and it does have a lot of bots automatically importing data from every wiki.
And this causes the problem you are concerned about, Stuart.
Here is a sample query of transgender/transsexual people in Wikidata:
http://tools.wmflabs.org/wikidata-todo/autolist.html?props=P21&cat_name=...
The items it lists should either make no claims about sex and gender, or have a 'sex/gender' claim (property 21, or P21) that includes 'transgender' (e.g. Q1052281 = transgender woman), or English Wikipedia is wrong...
https://www.wikidata.org/wiki/Q1052281
It is trivial to find examples where the only P21 claim is female (Q6581072) or male (Q6581097).
e.g. Buck Angel
BeneBot* adds 'male animal'
https://www.wikidata.org/w/index.php?title=Q958281&diff=9027854&oldi...
Legobot changes it to '[human] male'
https://www.wikidata.org/w/index.php?title=Q958281&diff=14944633&old...
A non-bot editor (now blocked on Korean Wikipedia) tried to change this to 'hefemale', and was reverted by Sk!d 12 hours later.
https://www.wikidata.org/w/index.php?title=Q958281&diff=49577100&old... https://www.wikidata.org/w/index.php?title=Q958281&diff=49664043&old...
Obviously 'hefemale' is not the best term, but it should have been corrected to the more appropriate and more precise 'trans man' rather than reverted.
Here are other bots (SamoaBot & VIAFbot) importing the wrong value from various datasets.
https://www.wikidata.org/w/index.php?title=Q5144952&diff=50988273&ol... https://www.wikidata.org/w/index.php?title=Q4709895&diff=52762663&ol...
Here is a human contributor doing it using Widar (a semi-automated tool).
https://www.wikidata.org/w/index.php?title=Q6118283&diff=115936683&o...
Most of these errors are over a year old and have still not been corrected.
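
The check described above (an item should either have no P21 claim, or have one that is not just plain female or male) can be sketched in a few lines of Python. This is only an illustration: it assumes the caller already has, for each item known to be about a transgender person, the list of its P21 values. The function names and the sample items are hypothetical, and nothing here queries Wikidata itself. Only the Q-IDs are taken from this message.

```python
# Q-IDs quoted earlier in this message.
FEMALE = "Q6581072"
MALE = "Q6581097"
TRANSGENDER_WOMAN = "Q1052281"
BINARY_ONLY = {FEMALE, MALE}


def needs_review(p21_values):
    """True when an item has P21 claims and every one is plain female/male."""
    values = set(p21_values)
    # An item with no P21 claim at all is acceptable, so an empty set passes.
    return bool(values) and values <= BINARY_ONLY


def flag_items(items):
    """items maps an item ID to its list of P21 values; return the IDs to review."""
    return sorted(item for item, values in items.items() if needs_review(values))


# Hypothetical sample data, not real Wikidata items.
sample = {
    "item-a": [MALE],                       # only a binary value: flagged
    "item-b": [TRANSGENDER_WOMAN],          # includes a transgender value: fine
    "item-c": [],                           # no claim about sex or gender: fine
    "item-d": [FEMALE, TRANSGENDER_WOMAN],  # mixed values: fine
}

print(flag_items(sample))
```

A list like this would only be a starting point for human review, since deciding the right value is exactly the kind of judgement call Stuart asked about.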