Re: [Wikidata] No links, wrong data: Scotland's orphans need help

7 Jun 2015

On 07.06.2015 18:29, Magnus Manske wrote:
...
 One question remaining is: Should there be a difference between
 "human-verified" and "bot-verified"? A bot can check if e.g. the label
 (or the words in the label) occur on the page at the URL to check, but
 it can't know for sure. Human review is more reliable, but vastly slower
 and not likely to happen for many/most such statements. Two different
 properties could act as different confidence levels. But maybe I'm just
 over-engineering this ;-)

It depends. For structured data sources, a bot should be able to do a 
thorough verification (possibly better than a human), e.g., by comparing 
name, birthdate and deathdate of a person at once. I would focus on 
these cases first since we have enough of them ;-)
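
A rough sketch of what such a structured check could look like, comparing all three fields at once. The `PersonRecord` type and the `records_agree` function are hypothetical names chosen for illustration, not part of any existing bot or API, and the name normalization is deliberately simplistic:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical minimal record for a person, from either source."""
    name: str
    birth: Optional[date]
    death: Optional[date]


def normalize(name: str) -> str:
    # Case-insensitive comparison with whitespace runs collapsed.
    return " ".join(name.casefold().split())


def records_agree(wikidata: PersonRecord, external: PersonRecord) -> bool:
    # Accept only if name, birth date and death date all match together.
    return (
        normalize(wikidata.name) == normalize(external.name)
        and wikidata.birth == external.birth
        and wikidata.death == external.death
    )


ours = PersonRecord("Douglas Adams", date(1952, 3, 11), date(2001, 5, 11))
theirs = PersonRecord("douglas  adams", date(1952, 3, 11), date(2001, 5, 11))
other = PersonRecord("Douglas Adams", date(1952, 3, 11), None)
```

Requiring all fields to match at once is what makes this stronger than a text match: a coincidental name match alone is not accepted.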

For cases where a bot can only make a guess, it might be better to add a 
human to the loop, as in your (truly amazing!) sourcerer game. The game 
also shows that it may depend on the items how well this approach works, 
since text matches are sometimes completely meaningless (e.g., "Human 
parent taxon homo" cannot be verified by looking for "Homo" since every 
page that might contain this fact also mentions "Homo sapiens" many 
times). For such difficult cases, I am not sure whether a bot-supplied 
note saying "looked correct, but I am not sure" would really be very 
helpful. It depends ;-)
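
To illustrate the problem with plain text matching, here is a minimal sketch of the "words of the label occur on the page" check. The function names and the sample page text are made up for the example:

```python
import re


def words(text: str) -> set:
    # Lower-cased set of word tokens in the text.
    return set(re.findall(r"\w+", text.casefold()))


def label_words_occur(label: str, page_text: str) -> bool:
    # True if every word of the label appears somewhere on the page.
    return words(label) <= words(page_text)


# Hypothetical page text that never states the parent taxon.
page = "Homo sapiens is the only extant species of humans."
homo_found = label_words_occur("Homo", page)
```

Here `homo_found` is true although the page says nothing about the parent taxon, so the match carries no evidence for the statement.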

Cheers,

Markus

...

 On Sun, Jun 7, 2015 at 4:19 PM Markus Krötzsch
 <markus(a)semantic-mediawiki.org> wrote:

     Coming back to Magnus's suggestion ... I think the existing property
     "retrieved" (P813) could be used for this "last verified on" property,
     that is, for setting the time at which some external reference was last
     compared to a claim in Wikidata.

     Magnus also pointed out that many external IDs are "self-verifying" in
     that they are their own reference. The situation is somewhat similar for
     homepages. Should we adopt the practice of giving a single retrieved
     value (without any further information) as the reference for such cases?

     Adding P813 dates more widely would also open up new ways of maintaining
     data, since one would have a way to filter statements by how long ago
     they had last been checked.
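
A minimal sketch of such a filter, assuming a simplified in-memory layout in which each statement carries a list of references and each reference may hold a P813 date. This layout and the `stale_statements` name are assumptions for illustration, not the actual Wikidata JSON format or an existing tool:

```python
from datetime import date, timedelta


def stale_statements(statements, today, max_age_days=365):
    """Return ids of statements not checked within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for st in statements:
        # Collect every "retrieved" (P813) date found on the references.
        dates = [ref["P813"] for ref in st["references"] if "P813" in ref]
        # Never checked, or most recent check is older than the cutoff.
        if not dates or max(dates) < cutoff:
            stale.append(st["id"])
    return stale


statements = [
    {"id": "s1", "references": [{"P813": date(2015, 6, 1)}]},
    {"id": "s2", "references": [{"P813": date(2013, 1, 1)}]},
    {"id": "s3", "references": [{}]},
]
needs_check = stale_statements(statements, today=date(2015, 6, 7))
```

Statements without any P813 date are treated as stale here, which is one possible design choice; a maintenance bot might prefer to report them separately.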

     Best wishes,

     Markus

     On 03.06.2015 15:56, Markus Krötzsch wrote:
 On 03.06.2015 13:57, Magnus Manske wrote:
> Maybe there is a case to separate import and verification here?
>
> There are many statements in Wikidata nowadays, but they get really
> "trustworthy" through references (other than "imported from Wikipedia").
> But for external IDs, references are superfluous; they are their own
> reference, by definition. So how about marking IDs with a "verified" (or
> "last verified on") qualifier? Much of such work could be done by bots;
> we could then filter the problematic ones out for manual verification.

 As we have no control over external lists, this would have to be
 re-checked every so often; but, again, bots to the rescue.

 Yes, I fully support this proposal.

 What do you think about making "last verified on" not a qualifier but
 (part of) the reference information? The reference could state where the
 bot has looked up the ID and give a time. This would be somewhat similar
 to what is now used in Freebase IDs, e.g., in
 https://www.wikidata.org/wiki/Q42.

 In general, it might be useful to have such a "last verified on"
 property that can be added to arbitrary references. There are many other
 uses for this. One common case would be that a user has changed the
 value without even being aware of the reference -- then one would be
 able to detect this automatically by comparing the last modification
 time with the "last verified on" date.
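
The comparison described above could be sketched as follows. The function name and the idea of passing both timestamps in directly are assumptions for illustration; how a bot would obtain them is left open:

```python
from datetime import datetime
from typing import Optional


def changed_since_verification(
    last_modified: datetime, last_verified: Optional[datetime]
) -> bool:
    # A statement needs re-checking if it was never verified, or if its
    # value was modified after the reference was last verified.
    return last_verified is None or last_modified > last_verified


verified = datetime(2015, 6, 1, 12, 0)
edited_later = datetime(2015, 6, 5, 9, 30)
edited_earlier = datetime(2015, 5, 20, 8, 0)
```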

 Putting the "last verified on" into the references also makes it
 possible to have different dates for different references there.

 Regards,

 Markus

     _______________________________________________
     Wikidata mailing list
     Wikidata(a)lists.wikimedia.org
     https://lists.wikimedia.org/mailman/listinfo/wikidata
