On Thu, Jun 4, 2015 at 12:56 PM, Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
On 04.06.2015 11:05, Dimitris Kontokostas wrote:
We are currently working on something that could be extended to serve
as a source for finding data conflicts or for import.
I have to check whether this can be integrated with the primary sources tool.
I hope we have something ready in the next couple of weeks, and I'll get
back to this thread.

Great, this sounds like a plan. The work on the primary sources tool will take a few more months before it is really ready for prime time. If this is too long, there might be more short-term solutions (such as a Wikidata game), but you'd have to ask the people running these in each case.

I'll show you what we can provide, and you can suggest options.
 
Another question: can DBpedia extract references from Wikipedia articles too? If this were possible, it might be feasible to guess and suggest a reference (or a list of references). Especially with things like date of death, one would expect that references have a publication date very close to (but strictly after) the event, which could narrow down the choices considerably.
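
The date heuristic described here could be sketched roughly as follows. This is only an illustration, not an existing DBpedia or Wikidata feature: the function name plausible_references, the 30-day window, and the sample references are all invented for the example.

```python
from datetime import date, timedelta

def plausible_references(event_date, references, max_days=30):
    """Keep references published strictly after event_date and at most
    max_days later, ordered from earliest to latest publication."""
    latest = event_date + timedelta(days=max_days)
    return sorted(
        (ref for ref in references if event_date < ref[1] <= latest),
        key=lambda ref: ref[1],
    )

# Invented sample data: (reference label, publication date)
refs = [
    ("ref-a", date(2015, 5, 20)),  # published before the event
    ("ref-b", date(2015, 6, 2)),
    ("ref-c", date(2015, 6, 1)),
    ("ref-d", date(2016, 1, 10)),  # published long after the event
]

candidates = plausible_references(date(2015, 5, 31), refs)
print([label for label, _ in candidates])  # ['ref-c', 'ref-b']
```

The window size is an arbitrary choice; a reference published on the day of the event itself is excluded because the comparison is strict.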

We don't extract them for now, although I think we could do so relatively easily. The problem in this case would be that we cannot associate references with facts. The DBpedia Information Extraction Framework is quite modular and can easily be extended with new extractors, but it is hard to make these extractors "talk to each other".
So we could easily get something like the following:
dbp:A dbo:birthDate "..."
dbp:A dbo:deathDate "..."
dbp:A dbo:reference dbp:r1 # and maybe "dbp:r1 ....something else" depending on the modeling
dbp:A dbo:reference dbp:r2

but I am not sure whether this solves your problem.
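
To make the limitation concrete, here is a small sketch of what a consumer could do with entity-level references like the triples above. It assumes the hypothetical dbo:reference property from the example, and the date values are invented. Since a reference is attached to the entity rather than to a fact, every reference becomes a candidate for every fact of that entity.

```python
from collections import defaultdict

# Hypothetical triples mirroring the example above; the values are invented.
triples = [
    ("dbp:A", "dbo:birthDate", "1900-01-01"),
    ("dbp:A", "dbo:deathDate", "1980-01-01"),
    ("dbp:A", "dbo:reference", "dbp:r1"),
    ("dbp:A", "dbo:reference", "dbp:r2"),
]

def candidate_pairs(triples, ref_property="dbo:reference"):
    """Pair every fact of a subject with every reference of that subject."""
    facts, refs = defaultdict(list), defaultdict(list)
    for s, p, o in triples:
        # Reference triples go into refs, everything else counts as a fact.
        (refs if p == ref_property else facts)[s].append((p, o))
    return [
        (s, p, o, ref)
        for s, fact_list in facts.items()
        for p, o in fact_list
        for _, ref in refs[s]
    ]

pairs = candidate_pairs(triples)
print(len(pairs))  # 4: two facts times two references
```

A human or a further heuristic would still have to decide which of these candidate pairs is actually correct.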

Cheers,
Dimitris
 

Cheers,

Markus


Best,
Dimitris

On Thu, Jun 4, 2015 at 11:49 AM, Gerard Meijssen
<gerard.meijssen@gmail.com> wrote:

    Hoi,
    Markus, with all due respect, we have a LOT of data in Wikidata that
    is plain wrong. When we add the missing data from DBpedia it is of a
    higher quality than what we have. Insisting that it first needs to
    be validated is foolish. It is not done for any of the work we do.
    All our bots make use of Wikipedia and in this DBpedia is no different.

    I do agree that it makes sense to verify the data that is different.
    But even so, when Wikidata says 1929 and DBpedia says 7-June-1929,
    our practice has been to remove the 1929 in favour of the more precise data.

    Let us be pragmatic and improve our data and start with what is missing.
    Thanks,
         GerardM

    On 4 June 2015 at 10:31, Markus Krötzsch
    <markus@semantic-mediawiki.org> wrote:

        Hi Dimitris,

        Interesting situation. If you have contradictory data from
        several templates, then the challenge will be to find out which
        information is correct for importing it to Wikidata. Could your
        dataset maybe become an input to the primary sources tool [1]?
        Then Wikidata users could help to clean the dataset and try to
        find references (as you know, references are quite important for
        Wikidata, but it would really be asking too much of DBpedia to
        provide these).

        This could be a viable strategy to merge DBpedia data into
        Wikidata. This email was only about person-related data, but one
        could do this for any kind of dataset where the information in
        DBpedia is of relatively high quality. I don't know exactly what
        the primary sources tool needs as input (it is still beta), but
        I think it mainly requires that a decent quality set of
        candidate statements is extracted and provided in some suitable
        format.

        As a first step, it might make sense to do a scan to see how
        many date-of-death (or whatever) statements in DBpedia are not
        yet found in Wikidata. If it is a small dataset (e.g., only a
        subset of the people who have died in the last year), then maybe
        one could also add and verify it in another way, not going
        through primary sources. But especially for recent deaths, there
        might be a great variety of sources (esp. newspaper articles)
        that are not easy to find without user support.

        Regards,

        Markus

        [1] https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool



        On 04.06.2015 09:56, Dimitris Kontokostas wrote:



            On Thu, Jun 4, 2015 at 1:18 AM, Markus Krötzsch
            <markus@semantic-mediawiki.org> wrote:

                 On 03.06.2015 22:44, Gerard Meijssen wrote:

                     Hoi,
                     The Dutch indicated their willingness to add the dead to
                     Wikidata ... I add quite a few dead from other countries
                     and because of Jura1 Brazilians who died in 2015 have an
                     added significance.

                     Given that we CAN produce lists like this, it makes sense
                     to reconsider the offer by the fine people from DBpedia
                     and have the information they harvest from Wikipedia
                     added automatically to Wikidata.. One reason I pointed
                     out on my recent blogpost..


                 DBpedia is getting this information from the contents of the
                 template Persondata as used on Wikipedia [1]. The enwiki
                 community just recently decided to maintain this data on
                 Wikidata instead. I guess this means that (English) DBpedia
                 will not contain this data in the future, unless they import
                 it from Wikidata (they are tracking the issue at [2]).


            Note that DBpedia gets person data both from the persondata
            template and from the infobox templates using the mappings wiki.
            We also noted that the data from the two is often out of sync
            (and usually the persondata is stale/wrong because people don't
            know of its existence).

            E.g., we have 28K items with two birth dates, one from the
            infobox and another from persondata.

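            PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
            # Count pairs of differing birth dates on the same subject;
            # the ?b1 < ?b2 test makes sure each pair is counted only once.
            SELECT (COUNT(*) AS ?n) WHERE {
              ?s dbpedia-owl:birthDate ?b1 ; dbpedia-owl:birthDate ?b2 .
              FILTER (?b1 != ?b2 && ?b1 < ?b2) }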
            http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+count%28*%29+where+%7B%3Fs+dbpedia-owl%3AbirthDate+%3Fb1+%3B+dbpedia-owl%3AbirthDate+%3Fb2+.%0D%0Afilter+%28%3Fb1+%21%3D+%3Fb2+%26%26+%3Fb1+%3C+%3Fb2%29%7D&format=text%2Fhtml&timeout=30000&debug=on

            The persondata template is used in German Wikipedia as well. The
            following release has ~2.2M triples coming from the German
            persondata template (which IIRC has the same problems as the
            English one).

            Best,
            Dimitris


                 So you see, times are changing quickly ... but overall I hope
                 that this is still solving the problem you identified, in
                 fact in a much more direct way than one might have hoped
                 for :-).

                 DBpedia may still play a role. I don't know how exactly the
                 enwiki community is planning to implement the move from
                 Persondata to Wikidata. It could be that DBpedia is the only
                 project extracting this data. So in a way, your suggestion
                 might be a great idea, though not as a long-term data
                 maintenance plan but as a one-time help for migration.

                 To support data maintenance further, it would make sense to
                 use bots for synching with authority files. These files also
                 contain death dates and they can even be used as a valid
                 reference.

                 Regards,

                 Markus

                 [1] https://en.wikipedia.org/wiki/Template:Persondata
                 [2] https://github.com/dbpedia/extraction-framework/issues/397

                     Thanks,
                              GerardM

            http://ultimategerardm.blogspot.nl/2015/06/wikidata-jurandyr-noronha-died-in-2015.html

                     On 3 June 2015 at 07:16, Gerard Meijssen
                     <gerard.meijssen@gmail.com> wrote:

                          Hoi,
                          Jura1 created a wonderful list of people who died in
                          Brazil in 2015 [1]. It is a page that may update
                          regularly from Wikidata thanks to the ListeriaBot.
                          Obviously, there may be a few more because I am
                          falling ever more behind with my quest for
                          registering deaths in 2015.

                          I have copied his work and created a page for people
                          who died in the Netherlands in 2015 [2]. It is
                          trivially easy to do this, and the result looks
                          great; it can be used for any country in any
                          Wikipedia.

                          The Dutch Wikipedia indicated that they nowadays
                          maintain important metadata at Wikidata. I am really
                          happy that we can showcase their work. It is
                          important work because, as someone reminded me at
                          some stage, this is part of what amounts to the
                          policy of living people...

                          Thanks,
                                 GerardM

                          [1] https://www.wikidata.org/wiki/User:Jura1/Recent_deaths_in_Brazil
                          [2] https://www.wikidata.org/wiki/User:Jura1/Recent_deaths_in_the_Netherlands




                     _______________________________________________
                     Wikidata mailing list
                     Wikidata@lists.wikimedia.org
                     https://lists.wikimedia.org/mailman/listinfo/wikidata







            --
            Kontokostas Dimitris






--
Kontokostas Dimitris





--
Kontokostas Dimitris