On 20/03/2015 01:11, Amir Ladsgroup wrote:
OK, I have some news:
1- Today I rewrote some parts of Kian, and it now chooses the regularization parameter (lambda) automatically, so predictions are more accurate. I wanted to push the changes to GitHub, but it seems my SSH setup has issues. They'll be there soon.
2- (Important) I wrote code that can find possible mistakes in Wikidata based on Kian. The code will be on GitHub soon. Check out this link. It's the result of comparing French Wikipedia against Wikidata, e.g. this line:
Q2994923: 1 (d), 0.257480420229 (w) [0, 0, 1, 2, 0]
1 (d) means Wikidata thinks it's a human

0.25... (w) means French Wikipedia thinks it's not a human (with 74.3% certainty)

And if you check the link you can see it's a mistake in Wikidata. Please check other results and fix them.
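
For anyone who wants to go through the report programmatically, here is a minimal sketch of how one line in the format above could be parsed. It is not Kian's actual code: the function name `parse_line`, the regular expression, and the 0.5 cut-off are assumptions made for illustration, and the bracketed list is kept as raw numbers because its meaning is not described here.

```python
import re

# Hypothetical pattern for lines such as:
#   Q2994923: 1 (d), 0.257480420229 (w) [0, 0, 1, 2, 0]
LINE_RE = re.compile(
    r"^(?P<qid>Q\d+):\s*(?P<d>[01])\s*\(d\),\s*"
    r"(?P<w>\d*\.?\d+)\s*\(w\)\s*\[(?P<extra>[^\]]*)\]$"
)


def parse_line(line, threshold=0.5):
    """Parse one report line and flag a Wikidata/Wikipedia disagreement.

    threshold is an assumed cut-off: a (w) score at or above it is read
    as "Wikipedia thinks this is a human".
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized line: %r" % line)

    wikidata_human = match.group("d") == "1"
    score = float(match.group("w"))
    wikipedia_human = score >= threshold
    # Certainty of the Wikipedia-side verdict, whichever way it goes.
    certainty = score if wikipedia_human else 1.0 - score
    # The bracketed numbers are kept uninterpreted.
    extra = [int(x) for x in match.group("extra").split(",") if x.strip()]

    return {
        "qid": match.group("qid"),
        "wikidata_human": wikidata_human,
        "wikipedia_human": wikipedia_human,
        "certainty": certainty,
        "disagree": wikidata_human != wikipedia_human,
        "extra": extra,
    }


if __name__ == "__main__":
    result = parse_line("Q2994923: 1 (d), 0.257480420229 (w) [0, 0, 1, 2, 0]")
    print(result["qid"], result["disagree"], round(result["certainty"] * 100, 1))
```

Lines where `disagree` is true are the candidates worth checking by hand.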

Tell me if you want this test to be run for another language too.

3- I used Kian to import unconnected pages from French Wikipedia and created about 1900 items. The result is here; please check whether anything in this list is not a human and tell me, so I can run some error analysis.


Best
The data are based on dumps, aren't they? Wikidata hasn't considered Q73823 a human since 21 Feb.