Hey everyone :)
We'll be doing the next Wikidata office hour on September 23rd at 17:00 UTC. See http://www.timeanddate.com/worldclock/fixedtime.html?hour=17&min=00&... for your timezone. We'll be meeting in #wikimedia-office on Freenode IRC.
As usual I'll start with an overview of what's been happening around Wikidata since the last office hour and then we'll have time for questions and discussions.
If there is a particular topic you'd like to have on the agenda please let me know.
Cheers Lydia
Hi Lydia,
If you think it's interesting to have the {{#Invoke:OSM|.... }} module on the agenda, I can say a word or two about it, if you like.
It helps find OpenStreetMap objects that are related in various ways to Wikipedia articles, by means of the Wikidata Q-numbers.
Tagging OSM objects with wikidata tags isn't very widespread at the moment, but I like to experiment with new possibilities as they come along.
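To give an idea of what those tags make possible, here is a rough sketch (in Python rather than the Lua of the actual module, and assuming the public Overpass API endpoint and its usual query syntax) of how one could look up the OSM objects that carry a given Q-number:

import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"  # assumed public endpoint

def osm_objects_for_qid(qid):
    # Find all nodes, ways and relations whose "wikidata" tag equals the given Q-number.
    query = """
    [out:json][timeout:25];
    (
      node["wikidata"="%s"];
      way["wikidata"="%s"];
      relation["wikidata"="%s"];
    );
    out tags center;
    """ % (qid, qid, qid)
    response = requests.post(OVERPASS_URL, data={"data": query}, timeout=60)
    response.raise_for_status()
    return response.json().get("elements", [])

for element in osm_objects_for_qid("Q64"):  # Q64 = Berlin, just as an example
    print(element["type"], element["id"], element.get("tags", {}).get("name"))

The Lua module itself does rather more (links in several directions), but this is the underlying idea.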
Cheers,
Jo
On Tue, Sep 8, 2015 at 3:22 PM, Jo winfixit@gmail.com wrote:
Sounds cool. Please come to the office hour and talk about it :)
Cheers Lydia
A quick reminder that this is happening in 2h and 45 minutes. See you there! :)
Cheers Lydia
And here is the log for anyone who missed it yesterday and wants to catch up: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.201...
Cheers Lydia
On Thu, Sep 24, 2015 at 5:43 AM, Lydia Pintscher < lydia.pintscher@wikimedia.de> wrote:
Thanks! Is there any more information on the issue with MusicBrainz?
17:26:27 <DanielK_WMDE> sjoerddebruin: yes, we went for MusicBrainz first, but it turned out to be impractical. you basically have to run their software in order to use their dumps
MusicBrainz was a major source of information for Freebase, so they appear to have been able to figure out how to parse the dumps (and they already have the MusicBrainz & Wikipedia IDs correlated).
Is there more detail, perhaps in a bug somewhere?
Tom
On Thu, Sep 24, 2015 at 7:54 PM, Tom Morris tfmorris@gmail.com wrote:
The issue is that they do offer dumps, but you need to set up your own MusicBrainz server to really use them. This was too time-intensive and complicated for the students to make progress on during their project. Because of this they decided to opt for another dataset to get started. MusicBrainz should still get done in the future. If anyone wants to work on adding more datasets to the tool, please let me know.
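For anyone wondering what can be done without the dumps: single entities can still be checked against their public web service. A rough sketch (the endpoint and parameters are my reading of their ws/2 JSON API, so treat it as illustrative):

import requests

MUSICBRAINZ_API = "https://musicbrainz.org/ws/2"  # public web service

def fetch_artist(mbid):
    # Look up one artist by MusicBrainz ID, including URL relations
    # (which is where links to Wikidata/Wikipedia live, as far as I know).
    response = requests.get(
        "%s/artist/%s" % (MUSICBRAINZ_API, mbid),
        params={"fmt": "json", "inc": "url-rels"},
        headers={"User-Agent": "wikidata-check-example/0.1 (please set a real contact)"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

The catch is that the web service is rate-limited (on the order of one request per second, if I remember correctly), so it is fine for spot checks but not for anything resembling a bulk import, which is exactly why the dumps, and hence the server setup, matter.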
Cheers Lydia
On 09/24/2015 10:59 AM, Lydia Pintscher wrote:
This is to add MusicBrainz to the primary source tool, not anything else?
peter
On Thu, Sep 24, 2015 at 2:18 PM, Peter F. Patel-Schneider < pfpschneider@gmail.com> wrote:
It's apparently worse than that (which I hadn't realized until I re-read the transcript). It sounds like it's just going to generate little warning icons for "bad" facts and not lead to the recording of any new facts at all.
17:22:33 <Lydia_WMDE> we'll also work on getting the extension deployed that will help with checking against 3rd party databases
17:23:33 <Lydia_WMDE> the result of constraint checks and checks against 3rd party databases will then be used to display little indicators next to a statement in case it is problematic
17:23:47 <Lydia_WMDE> i hope this way more people become aware of issues and can help fix them
17:24:35 <sjoerddebruin> Do you have any names of databases that are supported? :)
17:24:59 <Lydia_WMDE> sjoerddebruin: in the first version the german national library. it can be extended later
I know Freebase is deemed to be nasty and unreliable, but is MusicBrainz considered trustworthy enough to import directly or will its facts need to be dripped through the primary source soda straw one at a time too?
Tom
On Thu, Sep 24, 2015 at 8:31 PM, Tom Morris tfmorris@gmail.com wrote:
The primary sources tool and the extension that helps us check against other databases are two independent things. Imports from MusicBrainz have been happening for a long time already.
Cheers Lydia
Has anybody actually done an assessment on Freebase and its reliability?
Is it *really* too unreliable to import wholesale?
Are there any stats/progress graphs as to how the actual import is in fact going?
-- James.
On 24.09.2015 23:48, James Heald wrote:
Has anybody actually done an assessment on Freebase and its reliability?
Is it *really* too unreliable to import wholesale?
From experience with the Primary Sources tool proposals, the quality is mixed. Some things it proposes are really very valuable, but other things are also just wrong. I added a few very useful facts and fitting references based on the suggestions, but I also rejected others. Not sure what the success rate is for the cases I looked at, but my feeling is that some kind of "supervised import" approach is really needed when considering the total amount of facts.
An issue is that it is often fairly hard to tell if a suggestion is true or not (mainly in cases where no references are suggested to check). In other cases, I am just not sure if a fact is correct for the property used. For example, I recently ended up accepting "architect: Charles Husband" for Lovell Telescope (Q555130), but to be honest I am not sure that this is correct: he was the leading engineer contracted to design the telescope, which seems different from an architect; no official web site uses the word "architect" it seems; I could not find a better property though, and it seemed "good enough" to accept it (as opposed to the post code of the location of this structure, which apparently was just wrong).
Are there any stats/progress graphs as to how the actual import is in fact going?
It would indeed be interesting to see which percentage of proposals are being approved (and stay in Wikidata after a while), and whether there is a pattern (100% approval on some type of fact that could then be merged more quickly; or very low approval on something else that would maybe be better revisited for mapping errors or other systematic problems).
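If such an export existed, the analysis itself would be trivial. A sketch, assuming a purely hypothetical CSV with one row per proposal and columns "property" and "status" (approved/rejected/unreviewed):

import csv
from collections import Counter, defaultdict

def approval_rates(path):
    # Count decisions per property from the hypothetical export.
    counts = defaultdict(Counter)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["property"]][row["status"]] += 1
    # Approval rate = approved / (approved + rejected), ignoring unreviewed rows.
    rates = {}
    for prop, c in counts.items():
        decided = c["approved"] + c["rejected"]
        if decided:
            rates[prop] = (c["approved"] / decided, decided)
    return rates

for prop, (rate, decided) in sorted(approval_rates("proposals.csv").items(),
                                    key=lambda kv: kv[1][1], reverse=True):
    print("%s  %.0f%% approved  (%d decided)" % (prop, 100 * rate, decided))

The hard part is getting the decision data out of the tool in the first place.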
Markus
Hoi,
From experience with Wikidata data I often find that data is wrong or of questionable quality. Duh. Wikidata and Freebase are of a similar quality. What we know is that there is some data in Freebase we are not interested in, for instance things that double the data by having it in two places.
We know that data can be easily improved by comparing it with other sources. This process is available for Wikidata but not for the primary sources tool as far as I am aware.
The problem with the primary sources tool is that it does not lead to imports in Wikidata and therefore it is one big miserable failure. To compound this issue, it is an article of faith that we "need" it and it is therefore not a subject that is talked about. Importing it one statement at a time is an absolute waste of time. It makes the user experience horrible. Thanks, GerardM
It would indeed be interesting to see which percentage of proposals are being approved (and stay in Wikidata after a while), and whether there is a pattern (100% approval on some type of fact that could then be merged more quickly; or very low approval on something else that would maybe better revisited for mapping errors or other systematic problems).
+1, I think that's your best bet. Specific properties were much better maintained than others -- identify those that meet the bar for wholesale import and leave the rest to the primary sources tool.
Also, about the Freebase users themselves who did daily or weekly work... some were passing users, some tried harder but made lots of erroneous entries (battling against our Experts at times). We could probably provide a list of those sort-of community-blacklisted users whose data submissions should probably not be trusted.
+1 for looking at better-maintained specific properties. +1 for being cautious about some Freebase usernames and their entries. +1 for trusting wholesale all of the Freebase Experts' submissions. We policed each other quite well.
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Hoi, When you analyse the statistics, it shows how bad the current state of affairs is. Slightly over one in a thousand statements from the primary sources tool has been included.
Markus, Lydia and I agree that the content of Freebase may be improved. Where we differ is that the same can be said for Wikidata. It is not much better, and by including the data from Freebase we have a much improved coverage of facts. The same can be said for the content of DBpedia and probably other sources as well.
I seriously hate this procrastination and the denial of the efforts of others. It is one type of discrimination that is utterly deplorable.
We should concentrate on comparing Wikidata with other sources that are maintained. We should do this repeatedly and concentrate on workflows that seek the differences and provide workflows that help our community to improve what we have. What we have is the sum of all available knowledge and by splitting it up, we are weakened as a result. Thanks, GerardM
Hi Gerard, hi all,
The key misunderstanding here is the idea that the main issue with the Freebase import is data quality. It is actually community support. The goal of the current slow import process is for the Wikidata community to "adopt" the Freebase data. It's not about "storing" the data somewhere, but about finding a way to maintain it in the future.
The import statistics show that Wikidata does not currently have enough community power for a quick import. This is regrettable, but not something that we can fix by dumping in more data that will then be orphaned.
Freebase people: this is not a small amount of data for our young community. We really need your help to digest this huge amount of data! I am absolutely convinced from the emails I saw here that none of the former Freebase editors on this list would support low quality standards. They have fought hard to fix errors and avoid issues coming into their data for a long time.
Nobody believes that either Freebase or Wikidata can ever be free of errors, and this is really not the point of this discussion at all [1]. The experienced community managers among us know that it is not about the amount of data you have. Data is cheap and easy to get, even free data with very high quality. But the value proposition of Wikidata is not that it can provide storage space for a lot of data -- it is that we have a functioning community that can maintain it. For the Freebase data donation, we do not seem to have this community yet. We need to find a way to engage people to do this. Ideas are welcome.
What I can see from the statistics, however, is that some users (and I cannot say if they are "Freebase users" or "Wikidata users" ;-) are putting a lot of effort into integrating the data already. This is great, and we should thank these people because they are the ones who are now working on what we are just talking about here. In addition, we should think about ways of engaging more community in this. Some ideas:
(1) Find a way to clean and import some statements using bots. Maybe there are cases where Freebase already had a working import infrastructure that could be migrated to Wikidata? This would also solve the community support problem in one way. We just need to import the maintenance infrastructure together with the data. (A minimal bot sketch follows after this list.)
(2) Find a way to expose specific suggestions to more people. The Wikidata Games have attracted so many contributions. Could some of the Freebase data be solved in this way, with a dedicated UI?
(3) Organise Freebase edit-a-thons where people come together to work through a bunch of suggested statements.
(4) Form wiki projects that discuss a particular topic domain in Freebase and how it could be imported faster using (1)-(3) or any other idea.
(5) Connect to existing Wiki projects to make them aware of valuable data they might take from Freebase.
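For (1), the mechanics of the bot edit itself are not the hard part. A minimal sketch with pywikibot (item and property IDs below are only examples; an actual import would of course need a vetted input list and bot approval):

import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

def add_statement(item_id, prop_id, target_id, source_url):
    # Add one item-valued statement and attach a "reference URL" (P854) source.
    item = pywikibot.ItemPage(repo, item_id)
    claim = pywikibot.Claim(repo, prop_id)
    claim.setTarget(pywikibot.ItemPage(repo, target_id))
    item.addClaim(claim, summary="Importing reviewed Freebase statement")
    source = pywikibot.Claim(repo, "P854")
    source.setTarget(source_url)
    claim.addSources([source], summary="Adding source for imported statement")

# Example call with placeholder values:
# add_statement("Q42", "P31", "Q5", "http://example.org/where-this-came-from")

The hard part, as said above, is deciding which statements are clean enough to go in this way, which brings us back to community review.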
Freebase is a much better resource than many other data resources we are already using with similar approaches as (1)-(5) above, and yet it seems many people are waiting for Google alone to come up with a solution.
Cheers,
Markus
[1] Gerard, if you think otherwise, please let us know which error rates you think are typical or acceptable for Freebase and Wikidata, respectively. Without giving actual numbers you just produce empty strawman arguments (for example: claiming that anyone would think that Wikidata is better quality than Freebase and then refuting this point, which nobody is trying to make). See https://en.wikipedia.org/wiki/Straw_man
First... it looks like you REALLY need my help to finish the Freebase mapping? Hardly anything looks done... and I have the time and knowledge to fill it all in completely... https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Class_mapping
Markus, do you want me to start on that? It will probably take me this week to fill it out.
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Hoi,
Sorry I disagree with your analysis. The fundamental issue is not quality and it is not the size of our community. The issue is that we have our priorities wrong. As far as I am concerned the "primary sources tool" is a wrong approach for a dataset like Freebase or DBpedia.
What we should concentrate on is finding likely issues that exist in Wikidata, making people aware of them, and having a proper workflow that will point people to the things they care about. When I care about "polders", show me content where another source disagrees with what we have. As I care about "polders" I will spend time on it BECAUSE I care and am invited to resolve issues. I will be challenged because every item I touch has an issue. I do not mind doing this when the data in Wikidata differs from DBpedia, Freebase or whatever. My time is well spent. THAT is why I will be challenged, that is why I will be willing to work on this.
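To make that concrete, a rough sketch of such a difference-driven check; the Wikidata API call is real as far as I know, while the "external" data here is just a placeholder for DBpedia, Freebase or any other source:

import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def item_values(qid, prop):
    # Return the Q-ids that Wikidata currently has as values of prop on item qid.
    params = {"action": "wbgetentities", "ids": qid, "props": "claims", "format": "json"}
    data = requests.get(WIKIDATA_API, params=params, timeout=30).json()
    statements = data["entities"][qid]["claims"].get(prop, [])
    return {
        "Q%d" % s["mainsnak"]["datavalue"]["value"]["numeric-id"]
        for s in statements
        if s["mainsnak"].get("snaktype") == "value"
    }

# Placeholder for an external source: item -> the value it claims for P17 (country).
external = {"Q64": "Q183"}  # Berlin -> Germany, as an example

for qid, expected in external.items():
    ours = item_values(qid, "P17")
    if expected not in ours:
        print("%s: external source says %s, Wikidata has %s" % (qid, expected, ours or "nothing"))

Only the disagreements get shown to people, which is the point: their time goes to the items where a decision is actually needed.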
I will not do this for new data in the primary sources tool. At most I will give it a glance and accept it. I would only do this where data in the primary sources tool differs. That however is exactly the same scenario that I just described.
I am not willing to look at data from Freebase or DBpedia in the primary sources tool one item or statement at a time; we know that they are of a similar quality to Wikidata. The percentages make it a waste of time. With iterative comparisons against other sources we will find the booboos easily enough. We will spend the time of our communities effectively and we will increase quality, quantity and community.
The approach of the primary sources tool is wrong. It should only be about linking data and defining how this is done.
The problem is indeed with the community. Its time is wasted, and it is much more effective for me to add new data than to work on data that is already in the primary sources tool. Thanks, GerardM
On Sep 28, 2015 20:03, "Gerard Meijssen" gerard.meijssen@gmail.com wrote:
Hoi,
Sorry, I disagree with your analysis. The fundamental issue is not quality, and it is not the size of our community. The issue is that we have our priorities wrong. As far as I am concerned, the "primary sources tool" is the wrong approach for a dataset like Freebase or DBpedia.
What we should concentrate on is finding likely issues that exist in Wikidata. Make people aware of them and have a proper workflow that will point people to the things they care about. When I care about "polders", show me content where another source disagrees with what we have.
As I have said before, the extension to check against third-party databases is being worked on. This is not an argument against the primary sources tool. It is simply something very different.
As I care about "polders" I will spend time on it BECAUSE I care and am invited to resolve issues. I will be challenged because every item I touch has an issue. I do not mind doing this when the data in Wikidata differs from DBpedia, Freebase or whatever. My time is well spent. THAT is why I will be challenged, that is why I will be willing to work on this.
I will not do this for new data in the primary sources tool. At most I will give it a glance and accept it. I would only do this where data in the primary sources tool differs. That, however, is exactly the same scenario that I just described.
I am not willing to look at data in Wikidata, Freebase or DBpedia in the primary sources tool one item/statement at a time; we know that they are of similar quality to Wikidata. The percentages make it a waste of time. With iterative comparisons of other sources we will find the booboos easily enough. We will spend the time of our communities effectively and we will increase quality and community.
The approach of the primary sources tool is wrong. It should only be about linking data and defining how this is done.
The problem is indeed with the community. Its time is wasted, and it is much more effective for me to add new data than to work on data that is already in the primary sources tool.
Thanks, GerardM
On 28 September 2015 at 16:52, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
Hi Gerard, hi all,
The key misunderstanding here is the idea that the main issue with the Freebase import is data quality. It is actually community support. The goal of the current slow import process is for the Wikidata community to "adopt" the Freebase data. It's not about "storing" the data somewhere, but about finding a way to maintain it in the future.
The import statistics show that Wikidata does not currently have enough
community power for a quick import. This is regrettable, but not something that we can fix by dumping in more data that will then be orphaned.
Freebase people: this is not a small amount of data for our young
community. We really need your help to digest this huge amount of data! I am absolutely convinced from the emails I saw here that none of the former Freebase editors on this list would support low quality standards. They have fought hard to fix errors and avoid issues coming into their data for a long time.
Nobody believes that either Freebase or Wikidata can ever be free of
errors, and this is really not the point of this discussion at all [1]. The experienced community managers among us know that it is not about the amount of data you have. Data is cheap and easy to get, even free data with very high quality. But the value proposition of Wikidata is not that it can provide storage space for a lot of data -- it is that we have a functioning community that can maintain it. For the Freebase data donation, we do not seem to have this community yet. We need to find a way to engage people to do this. Ideas are welcome.
What I can see from the statistics, however, is that some users (and I
cannot say if they are "Freebase users" or "Wikidata users" ;-) are putting a lot of effort into integrating the data already. This is great, and we should thank these people because they are the ones who are now working on what we are just talking about here. In addition, we should think about ways of engaging more community in this. Some ideas:
(1) Find a way to clean and import some statements using bots (a minimal sketch of what such a bot edit could look like follows right after this list). Maybe there are cases where Freebase already had a working import infrastructure that could be migrated to Wikidata? This would also solve the community support problem in one way. We just need to import the maintenance infrastructure together with the data.
(2) Find a way to expose specific suggestions to more people. The
Wikidata Games have attracted so many contributions. Could some of the Freebase data be solved in this way, with a dedicated UI?
(3) Organise Freebase edit-a-thons where people come together to work
through a bunch of suggested statements.
(4) Form wiki projects that discuss a particular topic domain in
Freebase and how it could be imported faster using (1)-(3) or any other idea.
(5) Connect to existing Wiki projects to make them aware of valuable
data they might take from Freebase.
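Just to make (1) a little more concrete, here is a minimal sketch of what such a bot edit could look like (this is not an existing import bot; the input file, the duplicate check and the edit summary are only illustrative, and it only covers item-valued statements):

    import csv
    import pywikibot

    site = pywikibot.Site("wikidata", "wikidata")
    repo = site.data_repository()

    # hypothetical input: one pre-cleaned Freebase statement per row, as "qid,pid,target_qid"
    with open("freebase_cleaned.csv") as f:
        for qid, pid, target_qid in csv.reader(f):
            item = pywikibot.ItemPage(repo, qid)
            item.get()
            if pid in item.claims:  # leave statements the community already has untouched
                continue
            claim = pywikibot.Claim(repo, pid)
            claim.setTarget(pywikibot.ItemPage(repo, target_qid))
            item.addClaim(claim, summary="import reviewed Freebase statement")

The hard part is of course producing that pre-cleaned input in the first place, which is exactly where an existing Freebase import infrastructure would help.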
Freebase is a much better resource than many other data resources we are already using with approaches similar to (1)-(5) above, and yet it seems many people are waiting for Google alone to come up with a solution.
Cheers,
Markus
[1] Gerard, if you think otherwise, please let us know which error rates
you think are typical or acceptable for Freebase and Wikidata, respectively. Without giving actual numbers you just produce empty strawman arguments (for example: claiming that anyone would think that Wikidata is better quality than Freebase and then refuting this point, which nobody is trying to make). See https://en.wikipedia.org/wiki/Straw_man
On 26.09.2015 18:31, Gerard Meijssen wrote:
Hoi, When you analyse the statistics, it shows how bad the current state of affairs is. Slightly over one in a thousand of the content of the primary sources tool has been included.
Markus, Lydia and I agree that the content of Freebase may be improved. Where we differ is that the same can be said for Wikidata. It is not much better, and by including the data from Freebase we get a much improved coverage of facts. The same can be said for the content of DBpedia and probably other sources as well.
I seriously hate this procrastination and the denial of the efforts of others. It is one type of discrimination that is utterly deplorable.
We should concentrate on comparing Wikidata with other sources that are maintained. We should do this repeatedly and concentrate on workflows that seek out the differences and help our community to improve what we have. What we have is the sum of all available knowledge, and by splitting it up, we are weakened as a result. Thanks, GerardM
On 26 September 2015 at 03:32, Thad Guidry <thadguidry@gmail.com> wrote:
Also, Freebase users themselves who did daily, weekly work.... some were passing users, some tried harder, but made lots of erroneous entries (battling against our Experts at times). We could probably provide a list of those sorta community-blacklisted users whose data submissions should probably not be trusted.
+1 for looking at better maintained specific properties.
+1 for being cautious about some Freebase usernames and their entries.
+1 for trusting wholesale all of the Freebase Experts' submissions.
We policed each other quite well.
Thad +ThadGuidry <https://www.google.com/+ThadGuidry>
Markus, Lydia...
It looks like TPT had another page where the WD Properties were being mapped to Freebase here: https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Mapping
Do you need help in filling that out more?
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Hi Thad,
thanks for your support. I think this can be really useful. Now just to clarify: I am not developing or maintaining the Primary Sources tool, I just want to see more Freebase data being migrated :-) I think making the mapping more complete is clearly necessary and valuable, but maybe someone with more insight into the current progress on that level can give a more informed comment.
Markus
I think more fundamentally there is the issue that Wikidata doesn't serve end users well because the end users are not paying for it. (Contrast an NGO that would be doing things for people in Africa without asking the people what they want with a commercial operation that is going to fly or die based on its ability to serve identified needs of Africans.)
I am by no means a market fundamentalist but when you look at Amazon.com, you see there is a virtuous circle where small incremental improvements that make the store better put money on the bottom line, linking career advancement to customer success, etc. Over time the incremental changes snowball. (Alternatively we could have exponential convergence instead of expansion)
I was looking around for API management solutions, and they all address things like "creating stubs for the end user", "increasing developer engagement", "converting XML to JSON and vice versa", and the always dubious idea that adding a proxy server of some kind on the public internet would help you meet an SLA. None of them support the minimum viable product function of 'charging people to use the API' at a basic level. If you talk to the sales people, maybe they will help you with a "monetization engine" (who knows if it puts ads in the results), but you will pay at least as much per month for this feature as the Silk Road spent on software development (unfortunately earning it back in the form of marked bitcoins).
And the API management sites are dealing with big-name companies like Target and Clorox; yet all of these companies, avaricious and smart about money as they are, are not charging people for APIs.
If you are not the customer, you are the product.
"End user" is a fuzzy word though because that Dutch guy who is interested in Polders is not the ordinary end user, although you practically need to bring people like that into things like Wikidata because you need their curation. Another tough problem is that we all have our specialties, so one person really needs a good database of wine regions, another one ski areas, another one cares about books and another couldn't care less about books but is into video games. (The person who wants to contribute or pay for improvements for area Z does not care about area Y)
Freebase was not particularly successful at getting unpaid help to improve their database because of these fundamental economics; you might make the case that friction in the form of "this data format is different from everything else" or "the UI sux" or "the rest of the world hasn't caught up with us on tooling" is the main problem, but people would overcome those problems if the motivation existed.
Anyhow, there is this funny little thing that the gap between "5 cents" and free is bigger than the gap between "5 cents" and $1000, so you have the Bloombergs and Elseviers of the world charging $1000 for what somebody could provide for much less. This problem exists for the human readable web and so far advertising has been the answer, but it has not been solved for open data.
On 9/28/15 2:36 PM, Paul Houle wrote:
Anyhow, there is this funny little thing that the gap between "5 cents" and free is bigger than the gap between "5 cents" and $1000, so you have the Bloombergs and Elseviers of the world charging $1000 for what somebody could provide for much less. This problem exists for the human readable web and so far advertising has been the answer, but it has not been solved for open data.
There is a solution for Open Data; the trouble is that attention is increasingly mercurial.
You need Identity [2], Tickets [1], and ACLs [3].
All doable using existing Web Architecture.
Links:
[1] http://linkeddata.uriburner.com/c/9DV22GPS -- About Tickets
[2] http://linkeddata.uriburner.com/c/9G36GVL -- About WebID
[3] http://linkeddata.uriburner.com/c/9DFX6GKO -- Attribute-Based Access Controls (ABAC)
Gerard,
Why do you spend so much energy on criticising the work of other volunteers and companies that want to help Wikidata? Switching off Primary Sources would not achieve any progress towards what you want. I have made some proposals in my email on what else could be done to speed things up. You could work on realising some of these ideas, you could propose other activities to the community, or you could just help elsewhere on Wikidata. Focussing on a tool you don't like and don't want to use will not make you (or the rest of us) happy.
Markus
Hi!
Thank you Thad for your support!
First some pieces of news about the current progress:
The work on Primary Sources and the Freebase mapping has been on hold since the last day of my Google internship (in late August). We already have a lot of statements (13.7M) in the Primary Sources tool, and I think we should try to get Wikidata to adopt them before creating more.
Some answers:
First ... it looks like you REALLY need my help to finish the Freebase mapping ? Hardly anything looks done...and I have the time and knowledge to fill it all in completely... https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Class_mapping
This page is an attempt to map Freebase types to Wikidata classes. But it seems to me that it won't lead to any big addition of new good statements: the class hierarchy of Wikidata is very different from the Freebase type hierarchy, making the mapping difficult. I have already done something for people by creating a file with the QIDs of Wikidata items mapped to /people/person but without P31 Q5. Something like half of these were not, in fact, items about a person (a rough estimate), so I decided not to add this data to Primary Sources. But I have given the file to Magnus, who has imported it into his "person" game (thank you Magnus :-)).
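For illustration, the kind of check involved could look like the following (this is not the script that was actually used; the item IDs are arbitrary examples): the wbgetentities API is asked for the claims of a batch of mapped items, and those without any P31 value of Q5 (human) are reported.

    import requests

    API = "https://www.wikidata.org/w/api.php"

    def items_without_human_p31(qids):
        # wbgetentities accepts up to 50 ids per request
        params = {"action": "wbgetentities", "ids": "|".join(qids),
                  "props": "claims", "format": "json"}
        data = requests.get(API, params=params).json()
        for qid, entity in data.get("entities", {}).items():
            targets = set()
            for claim in entity.get("claims", {}).get("P31", []):
                snak = claim["mainsnak"]
                if snak.get("snaktype") == "value":
                    targets.add("Q%d" % snak["datavalue"]["value"]["numeric-id"])
            if "Q5" not in targets:
                yield qid

    # Q42 (Douglas Adams) is a human, Q64 (Berlin) is not, so only Q64 is reported
    print(list(items_without_human_p31(["Q42", "Q64"])))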
It looks like TPT had another page where the WD Properties were being mapped to Freebase here: https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Mapping Do you need help in filling that out more ?
I believe that the top properties are now mapped (we have 360 properties mapped). For example, if I take the dataset of facts tagged as reviewed in the dump [1] that have a mapped topic as subject, I am able to map 92% of them to Wikidata claims. So, if you have time to improve the mapping it would be a very nice task, but I don't think it will be the most rewarding one. I believe that improving the mapping between Freebase topics and Wikidata items will lead to far more additions (the mapping used to create the current content of the Primary Sources tool has only 4.56M connections).
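As a toy illustration of what "mapping a fact to a Wikidata claim" means here (the mid and the Freebase property below are made up, and the two small dictionaries stand in for the real topic and property mapping files): a fact can only become a claim when both its subject topic and its property are mapped, which is what the coverage figures above measure.

    topic_map = {"m.0abcde": "Q555130"}                       # Freebase mid -> Wikidata QID
    prop_map = {"/architecture/structure/architect": "P84"}   # Freebase property -> Wikidata property

    def to_claim(mid, fb_prop, value):
        if mid not in topic_map or fb_prop not in prop_map:
            return None  # unmapped topics or properties are what keeps coverage below 100%
        # item-valued objects also need the topic mapping; other values pass through
        return (topic_map[mid], prop_map[fb_prop], topic_map.get(value, value))

    print(to_claim("m.0abcde", "/architecture/structure/architect", "m.0xyz12"))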
This is great, and we should thank these people because they are the ones who are now working on what we are just talking about here. In addition, we should think about ways of engaging more community in this. Some ideas:
Thank you very much for all these ideas. I am currently working on two fronts in order to move the import of the already mapped statements forward:
1. Import some "good" datasets using my bot. I have already done it for the "simple" facts about humans (birth date, birth place...) that are tagged as reviewed in the Freebase dump [1]; a small sketch of the date conversion this involves follows after this list. I have created a wiki page to coordinate this work: https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Good_datasets
2. Optimize the Primary Sources tool in order to make it more usable. I have done some work to decrease the load time, and my aim now is to avoid unneeded page reloads.
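To give an idea of what point 1 involves, the fiddly part of importing such "simple" person facts is turning a Freebase date literal into the time value the Wikidata API expects, with the right precision. A sketch only (the function and its input format are illustrative, not the bot's actual code):

    def freebase_date_to_wikibase(date_str):
        # "1952-03-11" -> day precision (11), "1952-03" -> month (10), "1952" -> year (9)
        parts = date_str.split("-")
        precision = {1: 9, 2: 10, 3: 11}[len(parts)]
        padded = parts + ["01"] * (3 - len(parts))
        return {
            "time": "+%s-%s-%sT00:00:00Z" % tuple(padded),
            "timezone": 0, "before": 0, "after": 0,
            "precision": precision,
            "calendarmodel": "http://www.wikidata.org/entity/Q1985727",  # proleptic Gregorian
        }

    print(freebase_date_to_wikibase("1952-03-11"))  # full date
    print(freebase_date_to_wikibase("1952"))        # year only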
Cheers,
Thomas
[1] See http://www.freebase.com/freebase/valuenotation/is_reviewed
Le 28 sept. 2015 à 21:36, Markus Krötzsch markus@semantic-mediawiki.org a écrit :
Gerard,
Why do you spend so much energy on criticising the work of other volunteers and companies that want to help Wikidata? Switching off Primary Sources would not achieve any progress towards what you want. I have made some proposals in my email on what else could be done to speed things up. You could work on realising some of these ideas, you could propose other activities to the community, or you could just help elsewhere on Wikidata. Focussing on a tool you don't like and don't want to use will not make you (or the rest of us) happy.
Markus
On 28.09.2015 20:01, Gerard Meijssen wrote:
Hoi,
Sorry I disagree with your analysis. The fundamental issue is not quality and it is not the size of our community. The issue is that we have our priorities wrong. As far as I am concerned the "primary sources tool" is a wrong approach for a dataset like Freebase or DBpedia.
What we should concentrate on is find likely issues that exist in Wikidata. Make people aware of them and have a proper workflow that will point people to the things they care about. When I care about "polders" show me content where another source disagrees with what we have. As I care about "polders" I will spend time on it BECAUSE I care and am invited to resolve issues. I will be challenged because every item I touch has an issue. I do not mind to do this when the data in Wikidata differs from DBpedia, Freebase or whatever.. My time is well spend. THAT is why I will be challenged, that is why I will be willing to work on this.
I will not do this for new data in the primary sources tool. At most I will give it a glance and accept it. I would only do this where data in the primary sources tool differs. That however is exactly the same scenario that I just described.
I am not willing to look at data in Wikidata Freebase or DBpedia in the primary sources tool one item/statement at a time; we know that they are of a similar quality as Wikidata. The percentages make it a waste of time. With iterative comparisons of other sources we will find the booboos easy enough. We will spend the time of our communities effectively and we will increase quality and quality and community.
The approach of the primary sources tool is wrong. It should only be about linking data and define how this is done.
The problem is indeed with the community. Its time is wasted and it is much more effective for me to add new data than work on data that is already in the primary sources tool. Thanks, GerardM
On 28 September 2015 at 16:52, Markus Krötzsch <markus@semantic-mediawiki.org mailto:markus@semantic-mediawiki.org> wrote:
Hi Gerard, hi all,
The key misunderstanding here is that the main issue with the Freebase import would be data quality. It is actually community support. The goal of the current slow import process is for the Wikidata community to "adopt" the Freebase data. It's not about "storing" the data somewhere, but about finding a way to maintain it in the future.
The import statistics show that Wikidata does not currently have enough community power for a quick import. This is regrettable, but not something that we can fix by dumping in more data that will then be orphaned.
Freebase people: this is not a small amount of data for our young community. We really need your help to digest this huge amount of data! I am absolutely convinced from the emails I saw here that none of the former Freebase editors on this list would support low quality standards. They have fought hard to fix errors and avoid issues coming into their data for a long time.
Nobody believes that either Freebase or Wikidata can ever be free of errors, and this is really not the point of this discussion at all [1]. The experienced community managers among us know that it is not about the amount of data you have. Data is cheap and easy to get, even free data with very high quality. But the value proposition of Wikidata is not that it can provide storage space for lot of data -- it is that we have a functioning community that can maintain it. For the Freebase data donation, we do not seem to have this community yet. We need to find a way to engage people to do this. Ideas are welcome.
What I can see from the statistics, however, is that some users (and I cannot say if they are "Freebase users" or "Wikidata users" ;-) are putting a lot of effort into integrating the data already. This is great, and we should thank these people because they are the ones who are now working on what we are just talking about here. In addition, we should think about ways of engaging more community in this. Some ideas:
(1) Find a way to clean and import some statements using bots. Maybe there are cases where Freebase already had a working import infrastructure that could be migrated to Wikidata? This would also solve the community support problem in one way. We just need to import the maintenance infrastructure together with the data.
(2) Find a way to expose specific suggestions to more people. The Wikidata Games have attracted so many contributions. Could some of the Freebase data be solved in this way, with a dedicated UI?
(3) Organise Freebase edit-a-thons where people come together to work through a bunch of suggested statements.
(4) Form wiki projects that discuss a particular topic domain in Freebase and how it could be imported faster using (1)-(3) or any other idea.
(5) Connect to existing Wiki projects to make them aware of valuable data they might take from Freebase.
Freebase is a much better resource than many other data resources we are already using with similar approaches as (1)-(5) above, and yet it seems many people are waiting for Google alone to come up with a solution.
Cheers,
Markus
[1] Gerard, if you think otherwise, please let us know which error rates you think are typical or acceptable for Freebase and Wikidata, respectively. Without giving actual numbers you just produce empty strawman arguments (for example: claiming that anyone would think that Wikidata is better quality than Freebase and then refuting this point, which nobody is trying to make). See https://en.wikipedia.org/wiki/Straw_man
Hoi, So far no argument has been given why the primary sources tool WOULD work. People will be interested in curating Wikidata. People will not be interested in checking the primary sources tool one item or statement at a time. It is a numbers game; there is simply too much to do this way.
I know that comparing Wikidata against other sources is a different tool. It does, however, provide a sane way of working on data. Used iteratively, it gives a clean process to integrate this data effectively. It keeps our community involved and concentrated on the things where human effort makes a difference.
I have asked time and again for arguments why the primary sources tool would work. Arguably it does not function at all, and the statistics prove this. I have argued for a different approach, and as there are no counter-arguments there is only silence. I do not want to rubbish the work of others, but as long as this remains the only route, the official route, to import data from other sources, at some stage there is no alternative.
At some stage a tool like the primary sources tool becomes a liability. In my mind it certainly will be once we have the announced tool for comparing data against other sources. When the only argument for the primary sources tool is the effort that people and companies have put into it, that is a pitiful argument. A pity for the people involved, but that is it. Thanks, GerardM
On 28 September 2015 at 21:36, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
Gerard,
Why do you spend so much energy on criticising the work of other volunteers and companies that want to help Wikidata? Switching off Primary Sources would not achieve any progress towards what you want. I have made some proposals in my email on what else could be done to speed things up. You could work on realising some of these ideas, you could propose other activities to the community, or you could just help elsewhere on Wikidata. Focussing on a tool you don't like and don't want to use will not make you (or the rest of us) happy.
Markus
On 28.09.2015 20:01, Gerard Meijssen wrote:
Hoi,
Sorry, I disagree with your analysis. The fundamental issue is not quality and it is not the size of our community. The issue is that we have our priorities wrong. As far as I am concerned the "primary sources tool" is the wrong approach for a dataset like Freebase or DBpedia.
What we should concentrate on is finding likely issues that exist in Wikidata, making people aware of them, and having a proper workflow that points people to the things they care about. When I care about "polders", show me content where another source disagrees with what we have. As I care about "polders" I will spend time on it BECAUSE I care and am invited to resolve issues. I will be challenged because every item I touch has an issue. I do not mind doing this when the data in Wikidata differs from DBpedia, Freebase or whatever. My time is well spent. THAT is why I will be challenged, that is why I will be willing to work on this.
I will not do this for new data in the primary sources tool. At most I will give it a glance and accept it. I would only look closely where the data in the primary sources tool differs from what we have. That, however, is exactly the scenario I just described.
I am not willing to look at data from Wikidata, Freebase or DBpedia in the primary sources tool one item or statement at a time; we know that they are of a similar quality to Wikidata. The percentages make it a waste of time. With iterative comparisons against other sources we will find the booboos easily enough. We will spend our communities' time effectively and we will increase quality, quantity and community.
The approach of the primary sources tool is wrong. It should only be about linking data and defining how this is done.
The problem is indeed with the community. Its time is being wasted; it is much more effective for me to add new data than to work through data that is already in the primary sources tool. Thanks, GerardM
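A minimal sketch (Python) of the difference-driven workflow described above, under some assumptions: the external values are supposed to come from a DBpedia or Freebase dump prepared beforehand, and the hard-coded dictionary here is purely illustrative; only the wbgetclaims call is the real Wikidata API.

import requests

API = "https://www.wikidata.org/w/api.php"

# Assumed external data, e.g. extracted beforehand from a DBpedia or Freebase
# dump: (item id, property id) -> expected item value.
external = {
    ("Q55", "P36"): "Q727",  # Netherlands -> capital: Amsterdam
}

def first_item_value(qid, pid):
    # Fetch the statements for one item/property pair from Wikidata and
    # return the first item-valued statement, or None if there is none.
    reply = requests.get(API, params={
        "action": "wbgetclaims", "entity": qid,
        "property": pid, "format": "json"}).json()
    for claim in reply.get("claims", {}).get(pid, []):
        snak = claim["mainsnak"]
        if snak["snaktype"] == "value" and snak.get("datatype") == "wikibase-item":
            return "Q%d" % snak["datavalue"]["value"]["numeric-id"]
    return None

# Report only the disagreements, so someone who cares about the topic can decide
# which source is right instead of rubber-stamping suggestions one at a time.
for (qid, pid), expected in external.items():
    actual = first_item_value(qid, pid)
    if actual != expected:
        print(qid, pid, "wikidata:", actual, "external:", expected)

Run repeatedly against a refreshed dump, the report shrinks as either side gets corrected, which is the iterative comparison being asked for.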
Could it be possible to create some kind of info (a notification?) in a Wikipedia article saying that additional data is available for it in a queue ("Freebase") somewhere?
If you have the article on your watchlist, you would then get a warning that says "You lazy boy, get your ass over here and help us out!" Or perhaps slightly rephrased.
Another idea: make a kind of worklist on Wikidata that reflects the watchlists on the clients (the Wikipedias). But then, we often have items on our watchlist that we don't know much about. (Digression: somehow we should be able to sort out the things we know (the place we live, the persons we have met) from the things we have merely done something with (edited, copy-pasted).)
I have tried in the past to get some interest in worklists on Wikipedia, but there isn't much interest in making them. They would speed up the tedious task of finding the next page to edit once a given edit is completed. It is the same problem with imports from Freebase on Wikidata: locating the next item with the same kind of queued statement from Freebase, but within some worklist the user has some knowledge about.
Imagine "municipalities within a county" or "municipalities that are also on the user's watchlist", and combine that with the available unhandled Freebase statements.
Hi Gerard,
given the statistics you cite from
https://tools.wmflabs.org/wikidata-primary-sources/status.html
I see that 19.6k statements have been approved through the tool, and 5.1k statements have been rejected, which means that roughly 1 in 5 of the reviewed statements is deemed unsuitable by the users of primary sources.
Given that there are 12.4M statements in the tool, this means that about 2.5M statements will turn out to be unsuitable for inclusion in Wikidata (if the current ratio holds). Are you suggesting uploading all of these statements to Wikidata?
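As a worked version of that extrapolation (a minimal sketch in Python; the 19.6k, 5.1k and 12.4M figures are simply the ones cited above, not fresh numbers from the status page):

approved = 19600
rejected = 5100
in_tool = 12400000

# Share of the reviewed statements that were rejected: roughly 1 in 5.
rejection_rate = rejected / (approved + rejected)     # about 0.21

# Applied to everything still in the tool: roughly 2.5 million statements
# that would presumably not survive review, if the ratio holds.
projected_unsuitable = in_tool * rejection_rate

print(round(rejection_rate, 2), int(projected_unsuitable))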
Tpt has already uploaded, outside the primary sources tool, pieces of the data that have sufficient quality, and more is planned. But for the data where the suitability for Wikidata seems questionable, I would not know what other approach to use. Do you have a suggestion?
Once you have a suggestion and there is community consensus in doing it, no one will stand in the way of implementing that suggestion.
Cheers, Denny
Actually, my suggestion would be to switch on Primary Sources as a default tool for everyone. That should increase exposure and turnover without compromising the quality of the data.
On Mon, Sep 28, 2015 at 2:23 PM Denny Vrandečić vrandecic@google.com wrote:
Hi Gerard,
given the statistics you cite from
https://tools.wmflabs.org/wikidata-primary-sources/status.html
I see that 19.6k statements have been approved through the tool, and 5.1k statements have been rejected - which means that about 1 in 5 statements is deemed unsuitable by the users of primary sources.
Given that there are 12.4M statements in the tool, this means that about 2.5M statements will turn out to be unsuitable for inclusion in Wikidata (if the current ratio holds). Are you suggesting to upload all of these statements to Wikidata?
Tpt already did upload pieces of the data which have sufficient quality outside the primary sources tool, and more is planned. But for the data where the suitability for Wikidata seems questionable, I would not know what other approach to use. Do you have a suggestion?
Once you have a suggestion and there is community consensus in doing it, no one will stand in the way of implementing that suggestion.
Cheers, Denny
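For anyone who wants to retrace that extrapolation, here is a minimal sketch using the rounded figures cited above; it simply assumes the rejection rate observed so far holds across the whole dataset, which is of course the big "if":

```python
# Rounded figures from the status page as quoted above.
approved = 19_600
rejected = 5_100
total_in_tool = 12_400_000

rejection_rate = rejected / (approved + rejected)
print(f"rejection rate so far: {rejection_rate:.1%}")  # ~20.6%, i.e. about 1 in 5

# Naive extrapolation: assume the same rate for the statements not yet reviewed.
projected_unsuitable = total_in_tool * rejection_rate
print(f"projected unsuitable statements: {projected_unsuitable:,.0f}")  # ~2.6 million
```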
On Mon, Sep 28, 2015 at 1:19 PM John Erling Blad jeblad@gmail.com wrote:
Another idea: make a kind of worklist on Wikidata that reflects the watchlists on the clients (the Wikipedias). But then, we often have items on our watchlist that we don't know much about. (Digression: somehow we should be able to sort out those things we know (the place we live, the persons we have met) from those things we have merely worked on (edited, copy-pasted).)
I have been trying to get some interest in worklists on Wikipedia in the past, but there isn't much interest in making them. They would speed up the tedious task of finding the next page to edit after a given edit is completed. It is the same problem with imports from Freebase on Wikidata: locate the next item on Wikidata with the same queued statement from Freebase, but within some worklist that the user has some knowledge about.
Imagine "municipalities within a county" or "municipalities that are also on the user's watchlist", and combine that with the available unhandled Freebase statements.
On Mon, Sep 28, 2015 at 10:09 PM, John Erling Blad jeblad@gmail.com wrote:
Could it be possible to create some kind of info (notification?) in a Wikipedia article that additional data is available in a queue ("freebase") somewhere?
If you have the article on your watch-list, then you will get a warning that says "You lazy boy, get your ass over here and help us out!" Or perhaps slightly rephrased.
On Mon, Sep 28, 2015 at 4:52 PM, Markus Krötzsch < markus@semantic-mediawiki.org> wrote:
Hi Gerard, hi all,
The key misunderstanding here is the belief that the main issue with the Freebase import is data quality. It is actually community support. The goal of the current slow import process is for the Wikidata community to "adopt" the Freebase data. It's not about "storing" the data somewhere, but about finding a way to maintain it in the future.
The import statistics show that Wikidata does not currently have enough community power for a quick import. This is regrettable, but not something that we can fix by dumping in more data that will then be orphaned.
Freebase people: this is not a small amount of data for our young community. We really need your help to digest this huge amount of data! I am absolutely convinced from the emails I saw here that none of the former Freebase editors on this list would support low quality standards. They have fought hard to fix errors and avoid issues coming into their data for a long time.
Nobody believes that either Freebase or Wikidata can ever be free of errors, and this is really not the point of this discussion at all [1]. The experienced community managers among us know that it is not about the amount of data you have. Data is cheap and easy to get, even free data with very high quality. But the value proposition of Wikidata is not that it can provide storage space for a lot of data -- it is that we have a functioning community that can maintain it. For the Freebase data donation, we do not seem to have this community yet. We need to find a way to engage people to do this. Ideas are welcome.
What I can see from the statistics, however, is that some users (and I cannot say if they are "Freebase users" or "Wikidata users" ;-) are putting a lot of effort into integrating the data already. This is great, and we should thank these people because they are the ones who are now working on what we are just talking about here. In addition, we should think about ways of engaging more community in this. Some ideas:
(1) Find a way to clean and import some statements using bots (a minimal sketch of what a single bot edit could look like follows right after this list). Maybe there are cases where Freebase already had a working import infrastructure that could be migrated to Wikidata? This would also solve the community support problem in one way. We just need to import the maintenance infrastructure together with the data.
(2) Find a way to expose specific suggestions to more people. The Wikidata Games have attracted so many contributions. Could some of the Freebase data be solved in this way, with a dedicated UI?
(3) Organise Freebase edit-a-thons where people come together to work through a bunch of suggested statements.
(4) Form wiki projects that discuss a particular topic domain in Freebase and how it could be imported faster using (1)-(3) or any other idea.
(5) Connect to existing Wiki projects to make them aware of valuable data they might take from Freebase.
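To make idea (1) a bit more concrete: a single supervised bot edit could look roughly like the sketch below, written with pywikibot. This is only an illustration, not an existing import pipeline; the item, property and value IDs passed in would have to come from whatever reviewed Freebase mapping a project agrees on.

```python
import pywikibot

def add_reviewed_statement(item_qid, prop_pid, value_qid, source_url):
    """Add one human-reviewed statement plus its source URL to a Wikidata item."""
    site = pywikibot.Site("wikidata", "wikidata")
    repo = site.data_repository()

    item = pywikibot.ItemPage(repo, item_qid)
    item.get()

    # Skip items that already have a value for this property -- one of the
    # duplicate cases reviewers keep rejecting in the primary sources tool.
    if prop_pid in item.claims:
        return False

    claim = pywikibot.Claim(repo, prop_pid)
    claim.setTarget(pywikibot.ItemPage(repo, value_qid))
    item.addClaim(claim, summary="Import reviewed Freebase statement")

    # P854 is "reference URL"; keep the original source attached as a reference.
    source = pywikibot.Claim(repo, "P854")
    source.setTarget(source_url)
    claim.addSources([source], summary="Add source for imported statement")
    return True
```

A real bot run would of course add rate limiting, error handling and a bot-approved account, and would only be used for property mappings the community has signed off on.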
Freebase is a much better resource than many other data resources we are already using with similar approaches as (1)-(5) above, and yet it seems many people are waiting for Google alone to come up with a solution.
Cheers,
Markus
[1] Gerard, if you think otherwise, please let us know which error rates you think are typical or acceptable for Freebase and Wikidata, respectively. Without giving actual numbers you just produce empty strawman arguments (for example: claiming that anyone would think that Wikidata is better quality than Freebase and then refuting this point, which nobody is trying to make). See https://en.wikipedia.org/wiki/Straw_man
On 26.09.2015 18:31, Gerard Meijssen wrote:
Hoi, When you analyse the statistics, it shows how bad the current state of affairs is. Slightly over one thousandth of the content of the primary sources tool has been included.
Markus, Lydia and I agree that the content of Freebase may be improved. Where we differ is that the same can be said for Wikidata: it is not much better, and by including the data from Freebase we get much improved coverage of facts. The same can be said for the content of DBpedia, and probably other sources as well.
I seriously hate this procrastination and the denial of the efforts of others. It is one type of discrimination that is utterly deplorable.
We should concentrate on comparing Wikidata with other sources that are maintained. We should do this repeatedly, concentrating on workflows that seek out the differences and help our community to improve what we have. What we have is the sum of all available knowledge, and by splitting it up we are weakened as a result. Thanks, GerardM
On 26 September 2015 at 03:32, Thad Guidry <thadguidry@gmail.com mailto:thadguidry@gmail.com> wrote:
Also, Freebase users themselves who did daily, weekly work.... some were passing users, some tried harder but made lots of erroneous entries (battling against our Experts at times). We could probably provide a list of those sort-of community-blacklisted users whose data submissions should probably not be trusted.
+1 for looking at better maintained specific properties.
+1 for being cautious about some Freebase usernames and their entries.
+1 for trusting wholesale all of the Freebase Experts' submissions. We policed each other quite well.
Thad +ThadGuidry <https://www.google.com/+ThadGuidry>
Yes! +1
On Mon, Sep 28, 2015 at 11:27 PM, Denny Vrandečić vrandecic@gmail.com wrote:
Actually, my suggestion would be to switch on Primary Sources as a default tool for everyone. That should increase exposure and turnover, without compromising quality of data.
Denny Vrandečić, 28/09/2015 23:27:
Actually, my suggestion would be to switch on Primary Sources as a default tool for everyone.
Yes, it's a desirable aim to have one-click suggested actions (à la Wikidata game) embedded into items for everyone. As for this tool, independently of the data used, at least the slowness and the misleading messaging need to be fixed first: https://www.wikidata.org/wiki/Wikidata_talk:Primary_sources_tool
(Compare: we already have very easy "remove" buttons on all statements on all items. So the interface for large-scale easy correction of mistakes is already there, while for *insertion* it's still missing. Which is also the gist of Gerard's argument, I believe. I agree with Lydia we can eventually do both, of course.)
Nemo
Tpt did take a few datasets of high enough quality from the Freebase dataset and uploaded them directly. Those statements do not appear in the Primary Sources tool numbers, because they were uploaded directly - each set going through the normal community process.
The Primary Sources Tool is left with the datasets where we were not able to establish a high enough threshold of quality. For any dataset where this quality can be demonstrated to the community, I assume they will agree with a direct upload.
I am not sure what else to do here.
I am very thankful to Nemo for his rephrasing of the discussion and for pulling it to a constructive and actionable level.
Gerard, regarding your arguments:
- why would someone work on data in the primary sources tool when it is more effective to add data directly
Can you explain what you mean by "add data directly"? I am really not sure what you mean by this argument. Are you suggesting to upload the whole dataset without further review?
- why is data that is over 90% good denied access to Wikidata (ie as good as Wikidata itself)
But it is not over 90% good! We have a rejection rate of almost 20%. Also, even 10% errors would mean more than 1 million errors. I have yet to see consensus for uploading this.
- how do you justify the pst when so little data was included in Wikidata
The tool has been used to add thousands of statements and references to Wikidata, and that by a rather small set of people (because you need to intentionally install it). I would think that if we switch it on per default, the throughput should grow considerably. Nemo identified a few issues for that, and it would be good if we would work on these. Everyone is invited to help out with that.
- why not have Kian learn from the data set of Freebase and Wikidata and have smart suggestions
Kian is free to learn from the datasets. The data of Freebase has been available for years, and Kian would by far not be the first ML tool to use it for training purposes. If there is anything hindering Kian from using the Freebase data, let me know, and I will try to fix it.
- why waste people's time adding one item/statement at a time when you can focus on the statements that are in doubt (either in Freebase or in Wikidata
Because we don't know which ones are which. If you could tell me which of the 12 Million statements are good and which ones are not, and if there is consensus about that assessment, I'd be happy to upload them.
I hope that this answers your arguments.
Again, I do not understand what your proposal is. I am going through the process to release the data in an easy to use way. If the community agrees with that, it can then be directly imported to Wikidata - I certainly won't stop anyone from doing so and never had.
My feeling is that you are frustrated by what you perceive as slow progress. You keep yelling at people that their ideas and work are not good. I remember how much you attacked me about Wikidata and all the things I have been doing wrong about it. Gerard, if you think you are motivating me with your constant attacks, I have to tell you, you are not. I am not speaking for anyone else, but I am getting tired of this. I appreciate a critical voice, but not in the tone you are often delivering it.
So, instead of telling everyone how we are supposed to spend our volunteer time in order to get things done better, and how we are doing things wrong, why don't you lead by example, and do it right? All the data, all the tools, for anything you want to get done are available to you for free. It is a pretty amazing world - all you need is at click away. So go ahead and do what you want to get done.
Hi all,
Note: as far as I can tell, the stats available at https://tools.wmflabs.org/wikidata-primary-sources/status.html so far do not differentiate between "fact wrong" (as in "Barack Obama is president of Croatia" [fact wrong]) and "source wrong" ("Barack Obama is president of the United States", "according to http://www.theonion.com/" [fact correct, source wrong]). From anecdotal evidence: most rejected facts were rejected due to shady sources… A big problem is also Citogenesis (https://xkcd.com/978/).
Cheers, Tom
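The distinction Tom describes would be cheap to record on the tool's side. A sketch of what a per-reason rejection log could look like; the reason labels and statement ids below are invented purely for illustration:

```python
from collections import Counter
from enum import Enum

class RejectionReason(Enum):
    FACT_WRONG = "fact is wrong"        # e.g. "Barack Obama is president of Croatia"
    SOURCE_WRONG = "source is wrong"    # fact fine, but the cited page does not support it
    DUPLICATE = "already in Wikidata"
    OTHER = "other"

# Hypothetical export of reviewer decisions: (statement id, reason) pairs.
rejections = [
    ("fb-000123", RejectionReason.SOURCE_WRONG),
    ("fb-000124", RejectionReason.DUPLICATE),
    ("fb-000125", RejectionReason.SOURCE_WRONG),
]

# Tally rejections by reason, so "source wrong" can be told apart from "fact wrong".
print(Counter(reason for _, reason in rejections))
```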
Thomas Steiner, 28/09/2015 23:32:
Note: as far as I can tell, the stats available at https://tools.wmflabs.org/wikidata-primary-sources/status.html so far do not differentiate between "fact wrong" (as in "Barack Obama is president of Croatia" [fact wrong]) and "source wrong" ("Barack Obama is president of the United States", "according to http://www.theonion.com/" [fact correct, source wrong]).
Indeed. I only briefly tested "primary sources" because it's frustratingly slow, but the statements I rejected were not wrong, just ugly: for instance redundant references where we already had some. I'd dare to call them formatting issues, which a bot can certainly filter. But maybe I was lucky!
Nemo
Hi!
I see that 19.6k statements have been approved through the tool, and 5.1k statements have been rejected - which means that about 1 in 5 statements is deemed unsuitable by the users of primary sources.
From my (limited) experience with Primary Sources, there are several kinds of things there that I had rejected:
- Unsourced statements that contradict what is written in Wikidata
- Duplicate claims already existing in Wikidata
- Duplicate claims with worse data (i.e. less accurate location, less specific categorization, etc) or unnecessary qualifiers (such as adding information which is already contained in the item to item's qualifiers - e.g. zip code for a building)
- Source references that do not exist (404, etc.)
- Source references that do exist but either duplicate existing one (a number of sources just refer to different URL of the same data) or do not contain the information they should (e.g. link to newspaper's homepage instead of specific article)
- Claims that are almost obviously invalid (e.g. "United Kingdom" as a genre of a play)
I think at least some of these - esp. references that do not exist and duplicates with no refs - could be removed automatically, thus raising the relative quality of the remaining items.
OTOH, some of the entries can be made self-evident - i.e. if we talk about a movie and Freebase has an IMDB ID or Netflix ID, it may be quite easy to check that the ID is valid and refers to a movie by the same name, which should be enough to merge it.
Not sure if those one-off things are worth bothering with, just putting it out there to consider.
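The dead-reference case in particular looks automatable. A rough sketch of such a pre-filter, assuming the candidates can be exported as (item, property, value, source URL) tuples -- that tuple shape, like the example URL, is made up here for illustration:

```python
import requests

def reference_is_reachable(url, timeout=10):
    """Return True if the reference URL still resolves to something (no 404/410/...)."""
    try:
        resp = requests.head(url, allow_redirects=True, timeout=timeout)
        if resp.status_code == 405:  # some servers refuse HEAD; fall back to GET
            resp = requests.get(url, stream=True, timeout=timeout)
        return resp.status_code < 400
    except requests.RequestException:
        return False

# Hypothetical candidate statements waiting in the tool.
candidates = [
    ("Q555130", "P84", "Q-some-person", "http://example.org/page-that-no-longer-exists"),
]

still_worth_reviewing = [c for c in candidates if reference_is_reachable(c[3])]
print(f"{len(candidates) - len(still_worth_reviewing)} candidates dropped for dead references")
```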
I would like to add: old URLs that seem to be a source but do not support anything in the claim. For example, in an item about a person, the name or the birth date of the person does not appear on the page, yet the page is used as a source for the person's birth date.
If we want more domain-specific Wikidata curators we absolutely have to improve the flow of: (1) viewing an article on Wikipedia, (2) discovering the associated item on Wikidata, (3) making useful contributions to the item and the items surrounding it in the graph.
That little link on the side of every article in Wikipedia is literally invaluable... and is the main thing that distinguishes Wikidata from Freebase (IMHO). The (large) technical differences pale in comparison. I know that people are already working on that flow, but I think it's worth emphasizing here as we consider the requirements for scaling up community as we scale up data.
2 cents.. -Ben
Hoi, I have seen the statistics. The quality of Freebase cannot be understood by simply looking at the problems. People have been looking for problems and been identifying them. As a consequence more data ended up in the error bucket than in the good bucket. I have for instance added a lot of statements as "wrong" because they were exactly the same as the value already present. Consequently the error rate is not representative.
Denny, I have a suggestion. It is backed by math, it is backed by how people think. All the arguments are on my side. I have not heard your arguments and the "primary sources tool" was announced as a good thing and the community never agreed to having it. So leave the community out of it and focus on arguments.
- why would someone work on data in the primary sources tool when it is more effective to add data directly
- why is data that is over 90% good denied access to Wikidata (ie as good as Wikidata itself)
- how do you justify the pst when so little data was included in Wikidata
- why not have Kian learn from the data set of Freebase and Wikidata and have smart suggestions
- why waste people's time adding one item/statement at a time when you can focus on the statements that are in doubt (either in Freebase or in Wikidata)
The notion of having all new data go through the primary sources tool will see me leave the project when this is realised. I will feel that my time and intelligence are wasted.
Thanks,
GerardM
On Tue, Sep 29, 2015 at 8:15 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, I have seen the statistics. The quality of Freebase cannot be understood by simply looking at the problems. People have been looking for problems and been identifying them. As a consequence more data ended up in the error bucket than in the good bucket. I have for instance added a lot of statements as "wrong" because they were exactly the same as the value already present. Consequently the error rate is not representative.
Denny, I have a suggestion. It is backed by math, it is backed by how people think. All the arguments are on my side. I have not heard your arguments and the "primary sources tool" was announced as a good thing and the community never agreed to having it. So leave the community out of it and focus on arguments.
why would someone work on data in the primary sources tool when it is more effective to add data directly
There are several reasons. My personal reasons for using it: It is more convenient for me and I have more security that what I am putting into Wikidata is actually useful and correct.
why is data that is over 90% good denied access to Wikidata (ie as good as Wikidata itself)
Do you have any way to back up this 90% claim?
how do you justify the pst when so little data was included in Wikidata
pst?
why not have Kian learn from the data set of Freebase and Wikidata and have smart suggestions
No-one is preventing that.
why waste people's time adding one item/statement at a time when you can focus on the statements that are in doubt (either in Freebase or in Wikidata
You consider it wasting people's time. Please recognize that other people do not consider it wasting their time.
The notion of having all new data go through the primary sources tool will see me leave the project when this is realised. I will feel that my time and intelligence is wasted.
No-one said all new data should go through it as far as I know. I do want it to be a major part of our workflows but that is not the same thing as not allowing anything else.
Cheers Lydia
Thanks for creating a dedicated thread, Markus. It saddens me to see this opportunity squandered and I'd love to be able to help, but I find the project so opaque that it's difficult to find a way to engage. Perhaps it's just an artifact of the lack of transparency, but the current approach seems very ad hoc to me. It's difficult to tease apart which problems are due to bad Freebase data, which are due to the way the Freebase data is being processed for import, and which are due to the attitudes of the reviewers.
As Jason Douglas said on the other thread, the Freebase data isn't homogeneous in terms of quality or importance, and the appropriate way to evaluate and import the data is by segmenting it, whether that be by property, or data source, or whatever. The only analysis that seems to have been done so far is to rank properties by the number of values they have, which a) isn't a good proxy for quality and b) isn't even a good proxy for importance (there are a bunch of high-frequency things which are basically dead/obsolete).
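Property-level segmentation is also something the existing review log could support with very little tooling. A sketch, assuming the tool's decisions could be exported as (property, decision) pairs -- the pairs and the threshold below are invented purely for illustration:

```python
from collections import defaultdict

# Hypothetical export of reviewer decisions, one (property, decision) pair per review.
reviews = [
    ("P569", "approved"), ("P569", "approved"), ("P569", "approved"),
    ("P106", "approved"), ("P106", "rejected"), ("P106", "rejected"),
]

tallies = defaultdict(lambda: {"approved": 0, "rejected": 0})
for prop, decision in reviews:
    tallies[prop][decision] += 1

for prop, t in sorted(tallies.items()):
    total = t["approved"] + t["rejected"]
    rate = t["approved"] / total
    # With real data one would also require a minimum number of reviews per property.
    verdict = "candidate for wholesale import" if rate >= 0.95 else "leave in the tool"
    print(f"{prop}: {rate:.0%} approved over {total} reviews -> {verdict}")
```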
The two things that I think would greatly improve things are:
- document the current process & methodology
- adopt a systematic, iterative evaluation and improvement feedback loop
Since data is what drives this whole process, understanding how the existing data has been evaluated, filtered, transformed, etc. before being loaded into the primary sources tool is critical to understanding what the starting basis is. After that, understanding the meaning of the stats (and fixing them if they don't have the right meanings) is necessary to know how things need to be improved.
I'm having a hard time understanding the existing stats as well as correlating them with both people's anecdotal accounts and my understanding of the strengths and weaknesses of the Freebase data. Additionally, the stats represent, as I understand it, a single user's opinion of the quality of the fact, the property mapping, the source URL and probably other factors like their mood, how hungry they are, etc. It's going to include both false negatives and false positives.
When I look at one recent "approved" Freebase primary sources fact, I see that it was reverted the next day as a duplicate (https://www.wikidata.org/w/index.php?title=Q464371&dir=prev&offset=20140524064128&action=history), but I also see that Maryse Condé's occupation (P106) has a long and tortured history on Wikidata, with Dexbot importing "Woman of letters" from Italian Wikipedia, Brackibot switching it to "Author," and then Rezabot and a few more users all taking a shot at changing it to what they thought was best.
My gut feeling is that the bulk of the problems people are reporting with the Freebase-derived data loaded into the Primary Sources tool are due to the tool chain that prepares the data, but without better stats and insight into the processes it's really impossible to say. A systematic analysis is needed, not a bunch of recitations of anecdotes.
Tom
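For what it's worth, the per-property breakdown asked for above does not need much machinery. This is only a sketch, and it assumes a hypothetical per-statement export of reviewer decisions; the file name and column names are made up, not an actual output of the primary sources tool:

# Sketch: per-property approval rates from a (hypothetical) export of
# reviewer decisions, one row per proposed statement.
import csv
from collections import Counter

approved = Counter()
rejected = Counter()

with open("primary_sources_decisions.csv", newline="") as f:
    for row in csv.DictReader(f):  # assumed columns: property, decision
        if row["decision"] == "approved":
            approved[row["property"]] += 1
        else:
            rejected[row["property"]] += 1

for prop in sorted(set(approved) | set(rejected)):
    total = approved[prop] + rejected[prop]
    print(f"{prop}\t{total}\t{approved[prop] / total:.1%} approved")

Sorting that output by approval rate would immediately show which properties look safe for a faster import and which ones point at mapping errors or other systematic problems.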
On Mon, Sep 28, 2015 at 10:52 AM, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
Hi Gerard, hi all,
The key misunderstanding here is the idea that the main issue with the Freebase import is data quality. It is actually community support. The goal of the current slow import process is for the Wikidata community to "adopt" the Freebase data. It's not about "storing" the data somewhere, but about finding a way to maintain it in the future.
The import statistics show that Wikidata does not currently have enough community power for a quick import. This is regrettable, but not something that we can fix by dumping in more data that will then be orphaned.
Freebase people: this is not a small amount of data for our young community. We really need your help to digest this huge amount of data! I am absolutely convinced from the emails I saw here that none of the former Freebase editors on this list would support low quality standards. They have fought hard to fix errors and avoid issues coming into their data for a long time.
Nobody believes that either Freebase or Wikidata can ever be free of errors, and this is really not the point of this discussion at all [1]. The experienced community managers among us know that it is not about the amount of data you have. Data is cheap and easy to get, even free data with very high quality. But the value proposition of Wikidata is not that it can provide storage space for a lot of data -- it is that we have a functioning community that can maintain it. For the Freebase data donation, we do not seem to have this community yet. We need to find a way to engage people to do this. Ideas are welcome.
What I can see from the statistics, however, is that some users (and I cannot say if they are "Freebase users" or "Wikidata users" ;-) are putting a lot of effort into integrating the data already. This is great, and we should thank these people because they are the ones who are now working on what we are just talking about here. In addition, we should think about ways of engaging more community in this. Some ideas:
(1) Find a way to clean and import some statements using bots. Maybe there are cases where Freebase already had a working import infrastructure that could be migrated to Wikidata? This would also solve the community support problem in one way. We just need to import the maintenance infrastructure together with the data.
(2) Find a way to expose specific suggestions to more people. The Wikidata Games have attracted so many contributions. Could some of the Freebase data be solved in this way, with a dedicated UI?
(3) Organise Freebase edit-a-thons where people come together to work through a bunch of suggested statements.
(4) Form wiki projects that discuss a particular topic domain in Freebase and how it could be imported faster using (1)-(3) or any other idea.
(5) Connect to existing Wiki projects to make them aware of valuable data they might take from Freebase.
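To make idea (1) a little more concrete, a bot run could look roughly like the sketch below. It uses pywikibot; the input triples and the edit summary are hypothetical, and a real bot task would of course need references, proper duplicate handling and community approval first.

# Rough sketch of idea (1): add statements that have already been vetted,
# directly via a bot instead of through the primary sources tool.
import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

# (item, property, target item) triples assumed to be cleaned beforehand;
# the values below are only placeholders.
vetted_statements = [
    ("Q555130", "P84", "Q123456"),
]

for item_id, prop_id, target_id in vetted_statements:
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    if prop_id in item.claims:  # crude duplicate guard: skip if property already set
        continue
    claim = pywikibot.Claim(repo, prop_id)
    claim.setTarget(pywikibot.ItemPage(repo, target_id))
    item.addClaim(claim, summary="Importing vetted Freebase statement")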
Freebase is a much better resource than many other data resources we are already using with approaches similar to (1)-(5) above, and yet it seems many people are waiting for Google alone to come up with a solution.
Cheers,
Markus
[1] Gerard, if you think otherwise, please let us know which error rates you think are typical or acceptable for Freebase and Wikidata, respectively. Without giving actual numbers you just produce empty strawman arguments (for example: claiming that anyone would think that Wikidata is better quality than Freebase and then refuting this point, which nobody is trying to make). See https://en.wikipedia.org/wiki/Straw_man
On 26.09.2015 18:31, Gerard Meijssen wrote:
Hoi, When you analyse the statistics, it shows how bad the current state of affairs is. Slightly over one in a thousand of the statements in the primary sources tool has been included.
Markus, Lydia and I agree that the content of Freebase may be improved. Where we differ is that the same can be said for Wikidata. It is not much better, and by including the data from Freebase we get much improved coverage of facts. The same can be said for the content of DBpedia and probably other sources as well.
I seriously hate this procrastination and the denial of the efforts of others. It is one type of discrimination that is utterly deplorable.
We should concentrate on comparing Wikidata with other sources that are maintained. We should do this repeatedly and concentrate on workflows that surface the differences and help our community improve what we have. What we have is the sum of all available knowledge, and by splitting it up we are weakened as a result. Thanks, GerardM
On 26 September 2015 at 03:32, Thad Guidry <thadguidry@gmail.com> wrote:
Also, about the Freebase users themselves who did daily or weekly work: some were passing users, some tried harder but made lots of erroneous entries (battling against our Experts at times). We could probably provide a list of those sort-of community-blacklisted users whose data submissions should probably not be trusted.
+1 for looking at better maintained specific properties.
+1 for being cautious about some Freebase usernames and their entries.
+1 for trusting wholesale all of the Freebase Experts' submissions. We policed each other quite well.
Thad +ThadGuidry <https://www.google.com/+ThadGuidry>
On Fri, Sep 25, 2015 at 11:45 AM, Jason Douglas <jasondouglas@google.com> wrote:
> It would indeed be interesting to see which percentage of proposals are being approved (and stay in Wikidata after a while), and whether there is a pattern (100% approval on some type of fact that could then be merged more quickly; or very low approval on something else that would maybe better be revisited for mapping errors or other systematic problems).
+1, I think that's your best bet. Specific properties were much better maintained than others -- identify those that meet the bar for wholesale import and leave the rest to the primary sources tool.
On Thu, Sep 24, 2015 at 4:03 PM Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
On 24.09.2015 23:48, James Heald wrote:
> Has anybody actually done an assessment on Freebase and its reliability?
>
> Is it *really* too unreliable to import wholesale?
From experience with the Primary Sources tool proposals, the quality is mixed. Some things it proposes are really very valuable, but other things are also just wrong. I added a few very useful facts and fitting references based on the suggestions, but I also rejected others. Not sure what the success rate is for the cases I looked at, but my feeling is that some kind of "supervised import" approach is really needed when considering the total amount of facts.
An issue is that it is often fairly hard to tell if a suggestion is true or not (mainly in cases where no references are suggested to check). In other cases, I am just not sure if a fact is correct for the property used. For example, I recently ended up accepting "architect: Charles Husband" for Lovell Telescope (Q555130), but to be honest I am not sure that this is correct: he was the leading engineer contracted to design the telescope, which seems different from an architect; no official web site uses the word "architect" it seems; I could not find a better property though, and it seemed "good enough" to accept it (as opposed to the post code of the location of this structure, which apparently was just wrong).
> Are there any stats/progress graphs as to how the actual import is in fact going?
It would indeed be interesting to see which percentage of proposals are being approved (and stay in Wikidata after a while), and whether there is a pattern (100% approval on some type of fact that could then be merged more quickly; or very low approval on something else that would maybe better be revisited for mapping errors or other systematic problems).
Markus
> -- James.
> On 24/09/2015 19:35, Lydia Pintscher wrote:
>> On Thu, Sep 24, 2015 at 8:31 PM, Tom Morris <tfmorris@gmail.com> wrote:
>>>> This is to add MusicBrainz to the primary source tool, not anything else?
>>> It's apparently worse than that (which I hadn't realized until I re-read the transcript). It sounds like it's just going to generate little warning icons for "bad" facts and not lead to the recording of any new facts at all.
>>> 17:22:33 <Lydia_WMDE> we'll also work on getting the extension deployed that will help with checking against 3rd party databases
>>> 17:23:33 <Lydia_WMDE> the result of constraint checks and checks against 3rd party databases will then be used to display little indicators next to a statement in case it is problematic
>>> 17:23:47 <Lydia_WMDE> i hope this way more people become aware of issues and can help fix them
>>> 17:24:35 <sjoerddebruin> Do you have any names of databases that are supported? :)
>>> 17:24:59 <Lydia_WMDE> sjoerddebruin: in the first version the german national library. it can be extended later
>>> I know Freebase is deemed to be nasty and unreliable, but is MusicBrainz considered trustworthy enough to import directly or will its facts need to be dripped through the primary source soda straw one at a time too?
>> The primary sources tool and the extension that helps us check against other databases are two independent things.
>> Imports from Musicbrainz have been happening since a very long time already.
>> Cheers
>> Lydia
Hi Tom,
we are in the process of writing a document and preparing a release of the whole pipeline. Tpt only finished his internship a few weeks ago, and things simply take a bit of time to go through the review processes.
The release of the data and the document should allow for the insights you are asking for.
I hope that will help with most of the questions that are open. I am sorry we appeared opaque - this was certainly not our intention. We were frequently asking for input and putting data and code out there.
Cheers, Denny
Thanks, that sounds like it will be useful. Is there a date for when this will happen?
On Tue, Sep 29, 2015 at 3:54 PM, Denny Vrandečić vrandecic@google.com wrote:
> I hope that will help with most of the questions that are open. I am sorry we appeared opaque - this was certainly not our intention. We were frequently asking for input and putting data and code out there.
Perhaps I've missed some of what has already been published. Is there stuff beyond these two pages?
https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool
The only request for input that I saw was for help with property mapping:
https://lists.wikimedia.org/pipermail/wikidata/2015-June/006503.html
which I and several others commented on, but questions like "Has this been reviewed by anyone at Google familiar with the Freebase schema?" were ignored and comments like "Don't select properties based solely on frequency" and "Ignore deprecated properties" were rejected.
My closing comment was that there was a bunch of really basic groundwork to be done before it made sense to ask for wider review, but, as far as I know, that was the last thread addressing the project. When I look at the mapping page (https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Mapping) I still see a ton of duplicate entries, deprecated properties, user & base properties which don't make sense to import, and other cruft that people shouldn't have to wade through.
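The kind of pre-filtering meant here is cheap to do. Assuming a plain list of Freebase property IDs, one per line (the file names are made up), something like this would already drop the /user/ and /base/ namespaces and exact duplicates before anyone has to review the list:

# Sketch: remove /user/ and /base/ Freebase properties and duplicates
# from a (hypothetical) plain-text property list before human review.
seen = set()
with open("freebase_properties.txt") as src, open("filtered_properties.txt", "w") as out:
    for line in src:
        prop = line.strip()
        if not prop or prop in seen or prop.startswith(("/user/", "/base/")):
            continue
        seen.add(prop)
        out.write(prop + "\n")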
If there have been frequent requests for input, they've been in fora that I don't follow.
Tom
On Thu, Sep 24, 2015 at 11:48 PM, James Heald j.heald@ucl.ac.uk wrote:
> Has anybody actually done an assessment on Freebase and its reliability?
>
> Is it *really* too unreliable to import wholesale?
My own experience matches Markus'.
> Are there any stats/progress graphs as to how the actual import is in fact going?
https://tools.wmflabs.org/wikidata-primary-sources/status.html and https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool/URL_blacklist
Cheers Lydia
James Heald, 24/09/2015 23:48:
> Has anybody actually done an assessment on Freebase and its reliability?
Perhaps: http://www.semantic-web-journal.net/system/files/swj1141.pdf
Help is needed to publish a review of it: https://etherpad.wikimedia.org/p/WRN201509
Nemo
On 09/24/2015 11:31 AM, Tom Morris wrote:
On Thu, Sep 24, 2015 at 2:18 PM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
On 09/24/2015 10:59 AM, Lydia Pintscher wrote:
> On Thu, Sep 24, 2015 at 7:54 PM, Tom Morris <tfmorris@gmail.com> wrote:
>> Thanks! Is there any more information on the issue with MusicBrainz?
>>
>> 17:26:27 <DanielK_WMDE> sjoerddebruin: yes, we went for MusicBrainz first, but it turned out to be impractical. you basically have to run their software in order to use their dumps
>>
>> MusicBrainz was a major source of information for Freebase, so they appear to have been able to figure out how to parse the dumps (and they already have the MusicBrainz & Wikipedia IDs correlated).
>>
>> Is there more detail, perhaps in a bug somewhere?
> The issue is that they do offer dumps but you need to set up your own MusicBrainz server to really use them. This was too time-intensive and complicated for the students to make progress on during their project. Because of this they decided to opt for another dataset instead to get started. In the future MusicBrainz should still get done. If anyone wants to work on adding more datasets to the tool please let me know.
>
> Cheers
> Lydia
This is to add MusicBrainz to the primary source tool, not anything else?
It's apparently worse than that (which I hadn't realized until I re-read the transcript). It sounds like it's just going to generate little warning icons for "bad" facts and not lead to the recording of any new facts at all.
17:22:33 <Lydia_WMDE> we'll also work on getting the extension deployed that will help with checking against 3rd party databases
17:23:33 <Lydia_WMDE> the result of constraint checks and checks against 3rd party databases will then be used to display little indicators next to a statement in case it is problematic
17:23:47 <Lydia_WMDE> i hope this way more people become aware of issues and can help fix them
17:24:35 <sjoerddebruin> Do you have any names of databases that are supported? :)
17:24:59 <Lydia_WMDE> sjoerddebruin: in the first version the german national library. it can be extended later
I know Freebase is deemed to be nasty and unreliable, but is MusicBrainz considered trustworthy enough to import directly or will its facts need to be dripped through the primary source soda straw one at a time too?
Tom
I wonder how these warnings will work. I can see lots and lots of warnings due to minor variations in names of artists.
I do agree that MusicBrainz data should pass the Wikidata bar: the data in MusicBrainz appears to me to be noteworthy and related to information in some wiki (and true, although truth is not part of the Wikidata bar as far as I know).
peter
On Thu, Sep 24, 2015 at 9:17 PM, Peter F. Patel-Schneider pfpschneider@gmail.com wrote:
> I wonder how these warnings will work. I can see lots and lots of warnings due to minor variations in names of artists.
The software will take aliases into account as well as minor spelling changes. We'll need to see how it behaves with live data and then tweak it, but I am confident it won't be too huge a problem.
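Not the actual extension's logic, just a sketch of the general idea: comparing an external name against the label plus all aliases, after a bit of normalization, already absorbs most minor variations.

# Sketch: alias-aware, typo-tolerant name matching (illustrative only).
import unicodedata
from difflib import SequenceMatcher

def normalize(name):
    # lowercase, strip diacritics, collapse whitespace
    decomposed = unicodedata.normalize("NFKD", name.lower())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.split())

def matches(external_name, label, aliases, threshold=0.9):
    ext = normalize(external_name)
    return any(
        SequenceMatcher(None, ext, normalize(candidate)).ratio() >= threshold
        for candidate in [label] + list(aliases)
    )

# A missing diacritic or a known alias still counts as a match:
print(matches("Bjork", "Björk", ["Björk Guðmundsdóttir"]))  # True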
Cheers Lydia