Please consider: it has been said all too often that Primary Sources is the
tool that should be used. Given that it has a bad UI and is not maintained,
what benefits does it hold?
Why do we throw away all the good work when we do not value it?
On 20 December 2016 at 15:29, Markus Kroetzsch <
now that SQID supports the confirmation/rejection of statements from
Primary Sources (Freebase imports), I notice certain systematic issues with
it. I believe many of the proposals should be removed because they are
already represented in Wikidata and do not need to be imported.
Three types of data I found so far:
(1) Redundant "located in the administrative territorial entity" (P131)
statements. Wikidata stores only the next territory above/below the
current one in these relations. PS often suggests territories reachable
through several steps instead.
(login first to see
suggestions). There are almost 100 towns that fall into this area suggested
here, but they all should be organised in more specific sub-regions of the
There is a higher-level
territory suggested here (Bavaria) even though "Lower Bavaria" is already
Similar things are found, e.g., for occupation (P106), where a person that
is already a "sport cyclist" might be suggested to be a
(2) Syntactic variations of the "same" value. Typical cases are URLs,
which PS suggests with trailing "/" even after top-level domains, while
Wikidata often omits it. This means you have suggestions like "
when there is already "http://www.pirna.de".
(3) Redirect items as values. PS sometimes suggests statement values that
are redirects to other entities, for which there already is a statement.
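
To make the three cases concrete, here is a rough sketch of how such a
provider-side filter might look. It is only an illustration under stated
assumptions, not how PS actually works: the function names, the
`parent_of` mapping (item to its directly stated parent territories), and
the `redirects` mapping (redirect item to target item) are all
hypothetical, and the data is assumed to be available locally rather than
fetched from any real API.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Drop a bare trailing "/" so both spellings of a site root compare equal."""
    parts = urlsplit(url.strip())
    path = "" if parts.path == "/" else parts.path
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, parts.fragment)
    )


def resolve_redirect(item, redirects):
    """Follow redirect links until a non-redirect item is reached (cycle-safe)."""
    seen = set()
    while item in redirects and item not in seen:
        seen.add(item)
        item = redirects[item]
    return item


def ancestors(item, parent_of):
    """All items reachable from `item` by following parent links one or more steps."""
    seen = set()
    stack = list(parent_of.get(item, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parent_of.get(current, ()))
    return seen


def is_redundant_item(suggested, existing_values, parent_of, redirects):
    """Cases (1) and (3): the suggested value equals an existing value after
    redirect resolution, or is a higher-level territory of an existing value."""
    target = resolve_redirect(suggested, redirects)
    for value in existing_values:
        value = resolve_redirect(value, redirects)
        if target == value or target in ancestors(value, parent_of):
            return True
    return False


def is_redundant_url(suggested, existing_urls):
    """Case (2): the suggested URL differs from an existing one only syntactically."""
    return normalize_url(suggested) in {normalize_url(u) for u in existing_urls}


# Hypothetical example data; labels stand in for real item IDs.
PARENT_OF = {"Lower Bavaria": ["Bavaria"], "Bavaria": ["Germany"]}
REDIRECTS = {"old item": "merged item"}

print(is_redundant_item("Bavaria", ["Lower Bavaria"], PARENT_OF, REDIRECTS))
print(is_redundant_item("old item", ["merged item"], PARENT_OF, REDIRECTS))
print(is_redundant_url("http://www.pirna.de/", ["http://www.pirna.de"]))
```

Suggestions flagged this way could be dropped from the data set itself
instead of being rejected one by one, so they would not count as "wrong"
in the statistics. The same hierarchy check might also cover the
occupation case if `parent_of` were filled from a class hierarchy, but
that is an assumption on my part.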
All of these cases should be fixed on the provider side, not by hiding
suggestions in the UI (as it seems to be done by the PS gadget for case
(2)). This would also help to get better statistics: right now, all I can
do is reject all of these values, but this might be misleading if one
looks at the PS statistics, since the values are not wrong, just unnecessary.
Simply hiding suggestions that are not eliminated from the data also makes
the PS service's feature for finding items with suggestions much less
useful (you might find items that do not show you any suggestions).
I was wondering if anybody is still working on PS clean-up now or if this
part of the project is orphaned.
Prof. Dr. Markus Kroetzsch
Knowledge-Based Systems Group
Center for Advancing Electronics Dresden (cfaed)
Faculty of Computer Science
+49 351 463 38486
Wikidata mailing list