Hey Tom, thanks for your review. Note that this is a list of *possible* errors; it doesn't mean all of the entries are wrong :) (if that were the case, I would have gone ahead and removed them all)
On Mon, Aug 31, 2015 at 1:22 AM Tom Morris tfmorris@gmail.com wrote:
After glancing at https://www.wikidata.org/wiki/User:Ladsgroup/Kian/Possible_mistakes/frFilm, it doesn't appear to me that either the Wikidata type hierarchy or the Wikipedia category hierarchy is being considered when evaluating type mismatches. Is that intentional?
Not yet. It can be done with some programming effort. If people like the
reports and are willing to work on them, I promise to take that into account.
For example
Grave of the Fireflies (Q274520) https://www.wikidata.org/wiki/Q274520 No Yes (0.731427666987)
is an instance of animated film which is a subtype of film.
Conversely, this telefilm d'horreur
Le Collectionneur de cerveaux (Q579355) https://www.wikidata.org/wiki/Q579355 Yes No (0.239868037957)
is part of a subcategory of film d'horreur -> film de fiction
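A minimal sketch of the hierarchy check discussed above, in Python. This is not Kian's code: the class labels, the toy `SUBCLASS_OF` mapping, and the function names are all hypothetical, chosen only to illustrate walking "subclass of" links before flagging a type mismatch.

```python
from collections import deque

# Hypothetical toy data: class -> direct superclasses ("subclass of" links).
# A real check would load these links from Wikidata instead.
SUBCLASS_OF = {
    "animated film": {"film"},
    "television film": {"film"},
    "film": {"creative work"},
}


def is_subtype(cls, target, subclass_of=SUBCLASS_OF):
    """Return True if cls equals target or reaches it via subclass-of links."""
    seen = set()
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        if current in seen:
            continue  # guard against cycles in the hierarchy
        seen.add(current)
        queue.extend(subclass_of.get(current, ()))
    return False


def is_real_mismatch(instance_of, predicted_is_target, target):
    """Flag an item only if the prediction disagrees with Wikidata
    after the type hierarchy has been taken into account."""
    wikidata_says_target = any(is_subtype(c, target) for c in instance_of)
    return wikidata_says_target != predicted_is_target
```

With this toy data, an item that is an instance of "animated film" and predicted to be a film would not be reported, because "animated film" reaches "film" through the hierarchy.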
The other one that I glanced at, https://www.wikidata.org/wiki/User:Ladsgroup/Kian/Possible_mistakes/frHuman, seems to have systematic issues with correct classification of Wikipedia pages about multiple people (e.g. brothers), which Wikidata correctly identifies as not people.
These can be considered misclassifications in the Wikipedia articles, even though I'm not a big fan of this idea. It seems these articles lack proper categories, such as sibling- and duo-related categories; if those categories were there, Kian would know.
It also, strangely, seems to think that Wikidata atomic elements are humans and I can't see why:
calcium (Q706) https://www.wikidata.org/wiki/Q706 Yes No (0.0225392419603)
That's a bug in autolist. I don't know why autolist included Q706 in
humans. Maybe Magnus can tell. I need to dig deeper.
Have you considered using other signals as inputs to your models? For example, Freebase types should be a pretty reliable signal for things like humans and films.
No, but I will think about it and investigate using them :)
Tom
On Sun, Aug 30, 2015 at 11:56 AM, Amir Ladsgroup ladsgroup@gmail.com wrote:
Thanks Nemo!
I added new reports: https://www.wikidata.org/wiki/User:Ladsgroup/Kian/Possible_mistakes
If you check them, you can easily find tons of errors: some are mis-categorizations in Wikipedia, some are mistakes in connecting a Wikipedia article to the wrong item, some are vandalism in Wikidata, and some are mistakes by bots or Widar users. Please check them if you want better quality in Wikidata.
Best
On Sun, Aug 30, 2015 at 12:16 PM Federico Leva (Nemo) nemowiki@gmail.com wrote:
Amir Ladsgroup, 28/08/2015 20:17:
Another thing I did is reporting possible mistakes, when Wikipedia and Wikidata don't agree on one statement,
Nice, with this Wikidata has better quality control systems than Wikipedia. ;-)
Nemo
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata