Hoi, When you analyse the statistics, it shows how bad the current state of affairs is. Slightly over one thousandth of the content of the primary sources tool has been included.
Markus, Lydia and I agree that the content of Freebase may be improved. Where we differ is that the same can be said for Wikidata. It is not much better and by including the data from Freebase we have a much improved coverage of facts. The same can be said for the content of DBpedia and probably other sources as well.
I seriously hate this procrastination and the denial of the efforts of others. It is one type of discrimination that is utterly deplorable.
We should concentrate on comparing Wikidata with other sources that are maintained. We should do this repeatedly and concentrate on workflows that seek the differences and provide workflows that help our community to improve what we have. What we have is the sum of all available knowledge and by splitting it up, we are weakened as a result. Thanks, GerardM
On 26 September 2015 at 03:32, Thad Guidry thadguidry@gmail.com wrote:
Also, Freebase users themselves who did daily, weekly work.... some were passing users, some tried harder, but made lots of erroneous entries (battling against our Experts at times). We could probably provide a list of those sorta community blacklisted users whose data submissions should probably not be trusted.
+1 for looking at better maintained specific properties. +1 for being cautious for some Freebase usernames and their entries. +1 for trusting wholesale all of the Freebase Experts submissions. We policed each other quite well.
Thad +ThadGuidry https://www.google.com/+ThadGuidry
On Fri, Sep 25, 2015 at 11:45 AM, Jason Douglas jasondouglas@google.com wrote:
It would indeed be interesting to see which percentage of proposals are being approved (and stay in Wikidata after a while), and whether there is a pattern (100% approval on some type of fact that could then be merged more quickly; or very low approval on something else that would maybe be better revisited for mapping errors or other systematic problems).
+1, I think that's your best bet. Specific properties were much better maintained than others -- identify those that meet the bar for wholesale import and leave the rest to the primary sources tool.
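A minimal sketch of how such per-property approval rates could be computed and compared against a bar for wholesale import. Everything in it is an assumption made for illustration: the review records are made-up sample data rather than real Primary Sources tool output, and the function names, the 95% bar and the minimum review count are hypothetical.

```python
from collections import defaultdict

# Hypothetical review records (property, decision); not real tool data.
reviews = [
    ("date of birth", "approved"),
    ("date of birth", "approved"),
    ("date of birth", "approved"),
    ("architect", "approved"),
    ("architect", "rejected"),
    ("postal code", "rejected"),
    ("postal code", "rejected"),
]


def approval_rates(records):
    """Return {property: (approved, total, rate)} for the review records."""
    counts = defaultdict(lambda: [0, 0])  # property -> [approved, total]
    for prop, decision in records:
        counts[prop][1] += 1
        if decision == "approved":
            counts[prop][0] += 1
    return {p: (a, t, a / t) for p, (a, t) in counts.items()}


def split_by_bar(rates, bar=0.95, min_reviews=3):
    """Split properties into wholesale-import candidates and the rest.

    The bar and min_reviews values are illustrative assumptions only.
    """
    wholesale, manual = [], []
    for prop, (approved, total, rate) in sorted(rates.items()):
        if total >= min_reviews and rate >= bar:
            wholesale.append(prop)
        else:
            manual.append(prop)
    return wholesale, manual


rates = approval_rates(reviews)
wholesale, manual = split_by_bar(rates)
print("candidates for wholesale import:", wholesale)
print("leave to the primary sources tool:", manual)
```

Requiring a minimum number of reviews keeps a property with one or two lucky approvals from clearing the bar; a property with a very low rate could instead be flagged for a check of its mapping.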
On Thu, Sep 24, 2015 at 4:03 PM Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
On 24.09.2015 23:48, James Heald wrote:
Has anybody actually done an assessment on Freebase and its reliability?
Is it *really* too unreliable to import wholesale?
From experience with the Primary Sources tool proposals, the quality is mixed. Some things it proposes are really very valuable, but other things are also just wrong. I added a few very useful facts and fitting references based on the suggestions, but I also rejected others. Not sure what the success rate is for the cases I looked at, but my feeling is that some kind of "supervised import" approach is really needed when considering the total amount of facts.
An issue is that it is often fairly hard to tell if a suggestion is true or not (mainly in cases where no references are suggested to check). In other cases, I am just not sure if a fact is correct for the property used. For example, I recently ended up accepting "architect: Charles Husband" for Lovell Telescope (Q555130), but to be honest I am not sure that this is correct: he was the leading engineer contracted to design the telescope, which seems different from an architect; no official web site uses the word "architect" it seems; I could not find a better property though, and it seemed "good enough" to accept it (as opposed to the post code of the location of this structure, which apparently was just wrong).
Are there any stats/progress graphs as to how the actual import is in fact going?
It would indeed be interesting to see which percentage of proposals are being approved (and stay in Wikidata after a while), and whether there is a pattern (100% approval on some type of fact that could then be merged more quickly; or very low approval on something else that would maybe be better revisited for mapping errors or other systematic problems).
Markus
-- James.
On 24/09/2015 19:35, Lydia Pintscher wrote:
On Thu, Sep 24, 2015 at 8:31 PM, Tom Morris tfmorris@gmail.com wrote:
> This is to add MusicBrainz to the primary source tool, not anything else?
It's apparently worse than that (which I hadn't realized until I re-read the transcript). It sounds like it's just going to generate little warning icons for "bad" facts and not lead to the recording of any new facts at all.
17:22:33 <Lydia_WMDE> we'll also work on getting the extension deployed that will help with checking against 3rd party databases
17:23:33 <Lydia_WMDE> the result of constraint checks and checks against 3rd party databases will then be used to display little indicators next to a statement in case it is problematic
17:23:47 <Lydia_WMDE> i hope this way more people become aware of issues and can help fix them
17:24:35 <sjoerddebruin> Do you have any names of databases that are supported? :)
17:24:59 <Lydia_WMDE> sjoerddebruin: in the first version the german national library. it can be extended later
I know Freebase is deemed to be nasty and unreliable, but is MusicBrainz considered trustworthy enough to import directly or will its facts need to be dripped through the primary source soda straw one at a time too?
The primary sources tool and the extension that helps us check against other databases are two independent things. Imports from MusicBrainz have been happening for a very long time already.
Cheers Lydia
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata