If we want more domain-specific Wikidata curators, we absolutely have to improve the flow of:
(1) viewing an article on Wikipedia
(2) discovering the associated item on Wikidata
(3) making useful contributions to the item and the items surrounding it in the graph

That little link on the side of every Wikipedia article is invaluable, and is the main thing that distinguishes Wikidata from Freebase (IMHO).  The (large) technical differences pale in comparison.  I know that people are already working on that flow, but I think it's worth emphasizing here as we consider the requirements for scaling up the community as we scale up the data.

2 cents..

On Mon, Sep 28, 2015 at 4:12 PM, John Erling Blad <jeblad@gmail.com> wrote:
I would like to add old URLs that seem to be a source but do not reference anything in the claim. For example, in an item about a person, a page is used as a source for the person's birth date even though neither the person's name nor the birth date appears on that page.

On Mon, Sep 28, 2015 at 11:44 PM, Stas Malyshev <smalyshev@wikimedia.org> wrote:

> I see that 19.6k statements have been approved through the tool, and
> 5.1k statements have been rejected - which means that about 1 in 5
> statements is deemed unsuitable by the users of primary sources.

From my (limited) experience with Primary Sources, there are several
kinds of things there that I had rejected:

- Unsourced statements that contradict what is written in Wikidata
- Duplicate claims already existing in Wikidata
- Duplicate claims with worse data (e.g. less accurate location, less
specific categorization, etc.) or unnecessary qualifiers (such as adding
information which is already contained in the item to the item's qualifiers
- e.g. zip code for a building)
- Source references that do not exist (404, etc.)
- Source references that do exist but either duplicate an existing one (a
number of sources just refer to different URLs for the same data) or do
not contain the information they should (e.g. link to newspaper's
homepage instead of specific article)
- Claims that are almost obviously invalid (e.g. "United Kingdom" as a
genre of a play)

I think at least some of these - esp. references that do not exist and
duplicates with no refs - could be removed automatically, thus raising
the relative quality of the remaining items.
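
As a rough illustration of that automatic pass, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the Candidate shape, the prune_candidates name, and the url_is_live callback are hypothetical and are not the Primary Sources tool's actual data model or API. The liveness check is passed in as a function so the filtering logic can be read (and tested) without any network access.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    """A suggested statement. Hypothetical shape, not the tool's real format."""
    item: str                         # e.g. "Q1"
    prop: str                         # e.g. "P569"
    value: str
    source_url: Optional[str] = None  # None means the suggestion is unsourced


def prune_candidates(
    candidates: Iterable[Candidate],
    existing_claims: Set[Tuple[str, str, str]],
    url_is_live: Callable[[str], bool],
) -> List[Candidate]:
    """Drop suggestions that a reviewer would reject without thinking."""
    kept = []
    for c in candidates:
        is_duplicate = (c.item, c.prop, c.value) in existing_claims
        if is_duplicate and c.source_url is None:
            continue  # duplicate claim with no reference: adds nothing
        if c.source_url is not None and not url_is_live(c.source_url):
            continue  # source reference does not exist (404, etc.)
        kept.append(c)
    return kept
```

A duplicate claim that carries a live reference is deliberately kept, since it may still add a source to an existing statement; the other rejection categories above still need human review.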

OTOH, some of the entries can be made self-evident - i.e. if we are
talking about a movie and Freebase has an IMDb ID or Netflix ID, it may
be quite easy to check if that ID is valid and refers to a movie by the
same name, which should be enough to merge it.
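
The check described above could look roughly like the following sketch. The names normalize_title, id_confirms_title, and lookup_title are hypothetical; lookup_title stands in for whatever call would resolve an external ID to a title, and no real IMDb or Netflix API is assumed here.

```python
import unicodedata
from typing import Callable, Optional


def normalize_title(title: str) -> str:
    """Fold case, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    kept = [ch for ch in decomposed if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).casefold().split())


def id_confirms_title(
    external_id: str,
    item_label: str,
    lookup_title: Callable[[str], Optional[str]],
) -> bool:
    """True if the ID resolves and its title matches the item's label."""
    found = lookup_title(external_id)
    if found is None:
        return False  # ID is invalid or unknown to the external catalog
    return normalize_title(found) == normalize_title(item_label)
```

With a made-up in-memory catalog such as {"x1": "Amélie"}, calling id_confirms_title("x1", "Amelie", catalog.get) succeeds, while an unknown ID or a different title fails. Exact matching after normalization is a conservative choice for the sketch: remakes and translated titles would still need a human look.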

Not sure if those one-off things are worth bothering with, just putting it
out there to consider.

Stas Malyshev

Wikidata mailing list
