From experience with Wikidata I often find that data is wrong or of questionable quality. Duh. Wikidata and Freebase are of similar quality. What we do know is that Freebase has some data we are not interested in, for instance items that duplicate data by storing it in two places.

We know that data can be easily improved by comparing it with other sources. This process is available for Wikidata but not for the primary sources tool as far as I am aware.

The problem with the primary sources tool is that it does not lead to imports into Wikidata, and therefore it is one big miserable failure. To compound this issue, it is an article of faith that we "need" it, and so it is not a subject that is talked about. Importing data one statement at a time is an absolute waste of time. It makes the user experience horrible.

On 25 September 2015 at 01:02, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
On 24.09.2015 23:48, James Heald wrote:
Has anybody actually done an assessment on Freebase and its reliability?

Is it *really* too unreliable to import wholesale?

From experience with the Primary Sources tool proposals, the quality is mixed. Some things it proposes are really very valuable, but other things are also just wrong. I added a few very useful facts and fitting references based on the suggestions, but I also rejected others. Not sure what the success rate is for the cases I looked at, but my feeling is that some kind of "supervised import" approach is really needed when considering the total amount of facts.

An issue is that it is often fairly hard to tell if a suggestion is true or not (mainly in cases where no references are suggested to check). In other cases, I am just not sure if a fact is correct for the property used. For example, I recently ended up accepting "architect: Charles Husband" for Lovell Telescope (Q555130), but to be honest I am not sure that this is correct: he was the leading engineer contracted to design the telescope, which seems different from an architect; no official web site uses the word "architect" it seems; I could not find a better property though, and it seemed "good enough" to accept it (as opposed to the post code of the location of this structure, which apparently was just wrong).

Are there any stats/progress graphs as to how the actual import is in fact going?

It would indeed be interesting to see what percentage of proposals are being approved (and stay in Wikidata after a while), and whether there is a pattern (100% approval on some type of fact that could then be merged more quickly; or very low approval on something else that would maybe be better revisited for mapping errors or other systematic problems).
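
As a rough illustration of the kind of statistic described above, here is a minimal Python sketch that computes an approval rate per type of fact. The record layout (a property label paired with a decision string) and the sample values are hypothetical, made up for illustration only; they are not real Primary Sources tool data and do not reflect its actual API or export format.

```python
from collections import defaultdict


def approval_rates(reviews):
    """Return {property: fraction approved} for (property, decision) pairs."""
    counts = defaultdict(lambda: [0, 0])  # property -> [approved, total]
    for prop, decision in reviews:
        counts[prop][1] += 1
        if decision == "approved":
            counts[prop][0] += 1
    return {prop: approved / total for prop, (approved, total) in counts.items()}


# Hypothetical sample of reviewed suggestions, not real data.
sample_reviews = [
    ("architect", "approved"),
    ("architect", "rejected"),
    ("postal code", "rejected"),
    ("date of birth", "approved"),
    ("date of birth", "approved"),
]

rates = approval_rates(sample_reviews)
for prop, rate in sorted(rates.items()):
    print(f"{prop}: {rate:.0%}")
```

Properties with a rate near 1.0 would be candidates for faster merging, while those near 0.0 would point to a mapping that may need revisiting.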


   -- James.

On 24/09/2015 19:35, Lydia Pintscher wrote:
On Thu, Sep 24, 2015 at 8:31 PM, Tom Morris <tfmorris@gmail.com> wrote:
This is to add MusicBrainz to the primary source tool, not anything

It's apparently worse than that (which I hadn't realized until I re-read the transcript). It sounds like it's just going to generate little warning icons for "bad" facts and not lead to the recording of any new facts at all.

17:22:33 <Lydia_WMDE> we'll also work on getting the extension deployed that will help with checking against 3rd party databases
17:23:33 <Lydia_WMDE> the result of constraint checks and checks against 3rd party databases will then be used to display little indicators next to a statement in case it is problematic
17:23:47 <Lydia_WMDE> i hope this way more people become aware of issues and can help fix them
17:24:35 <sjoerddebruin> Do you have any names of databases that are supported? :)
17:24:59 <Lydia_WMDE> sjoerddebruin: in the first version the german national library. it can be extended later

I know Freebase is deemed to be nasty and unreliable, but is MusicBrainz considered trustworthy enough to import directly or will its facts need to be dripped through the primary source soda straw one at a time too?

The primary sources tool and the extension that helps us check against
other databases are two independent things.
Imports from MusicBrainz have been happening for a very long time.


Wikidata mailing list
