Hoi,
The first reason is obvious: external sources have their issues and do not necessarily fit our purpose. A case in point is a substance database that indicates that certain substances are registered as a medicine. When those substances are known to be proven ineffective, what is the value of that statement? You want to differ on such statements, or at least indicate that the substances are proven ineffective.

The second reason is that sources have their issues and biases, and I doubt that we would ever hand over control. The best we could and should do is register the differences and source what we believe is true.

It then becomes a two-way street. We learn about their point of view and they learn about ours. They can change their data based on our input. When they do not, it is their database, their effort, their priority.

Yet another reason is that the same data is known in many databases. Some only register a year of birth and not a date. Obviously we do not need a year when we have the complete date.

When several unrelated sources share the same data, we want to track the differences between them all and know what is what.
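
As a rough illustration of tracking such differences, here is a minimal Python sketch. The BirthClaim type, the source names and the report() helper are all hypothetical, not an existing Wikidata tool. It treats a year-only value as consistent with a full date in the same year, and flags only real contradictions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BirthClaim:
    """One source's statement about a date of birth (hypothetical type)."""
    source: str
    year: int
    month: Optional[int] = None  # None when the source only records a year
    day: Optional[int] = None


def agrees(a: BirthClaim, b: BirthClaim) -> bool:
    """True when the two claims do not contradict at the precision both share."""
    if a.year != b.year:
        return False
    if a.month is not None and b.month is not None and a.month != b.month:
        return False
    if a.day is not None and b.day is not None and a.day != b.day:
        return False
    return True


def report(claims: List[BirthClaim]) -> Tuple[str, List[str], List[str]]:
    """Pick the most precise claim as reference and classify the others."""
    reference = max(
        claims, key=lambda c: (c.month is not None) + (c.day is not None)
    )
    consistent = [
        c.source for c in claims if c is not reference and agrees(c, reference)
    ]
    conflicting = [c.source for c in claims if not agrees(c, reference)]
    return reference.source, consistent, conflicting


claims = [
    BirthClaim("db_a", 1950),          # year only
    BirthClaim("db_b", 1950, 3, 14),   # complete date
    BirthClaim("db_c", 1951),          # different year
]
reference, consistent, conflicting = report(claims)
```

Using the most precise claim as the reference is only one possible choice. The point is that the differences are recorded per source and not silently overwritten.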
Thanks,
      GerardM

On 11 June 2016 at 11:55, Sandra Fauconnier <sandra.fauconnier@gmail.com> wrote:
Why? I would enthusiastically welcome an automated system that allows us to stay up to date with all the authority databases for which we have properties.
At this moment, I have a hunch that we are already hopelessly out of date with most of the ones we started to add a few years ago. 
I’d like to hear other suggestions that provide broad scalable solutions for this (not just ‘I’m tracking this one dataset that is important to me’, but ‘this allows us to track all hundreds of external datasets we have some kind of identifiers for’).

Sandra

On 11 Jun 2016, at 09:32, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:

Hoi,
ResourceSync is unlikely to be adopted by Wikidata for importing changes from elsewhere. If others want to share data FROM Wikidata, there is no problem with providing ResourceSync.

What I do not completely understand is how its mechanism for indicating changes may be used. If it can be used to generate reports so that people can actually see the differences, it could be really important for improving quality.
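
To make the reporting idea concrete, here is a minimal Python sketch that groups the entries of a ResourceSync-style change list by type of change. It assumes the change list is a Sitemap document in which each url element carries an rs:md element with a change attribute. The sample document and its example.org URLs are made up for illustration. Nothing is written back to any database: the result is only a report for people to review.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespaces assumed for Sitemap and ResourceSync documents.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Made-up sample change list; real ones would be fetched from the source.
SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist" from="2016-06-01T00:00:00Z"/>
  <url>
    <loc>http://example.org/record/1</loc>
    <lastmod>2016-06-02T10:00:00Z</lastmod>
    <rs:md change="updated"/>
  </url>
  <url>
    <loc>http://example.org/record/2</loc>
    <lastmod>2016-06-03T08:30:00Z</lastmod>
    <rs:md change="deleted"/>
  </url>
  <url>
    <loc>http://example.org/record/3</loc>
    <lastmod>2016-06-04T12:00:00Z</lastmod>
    <rs:md change="updated"/>
  </url>
</urlset>
"""


def changes_by_type(xml_text):
    """Return {change type: [resource URLs]} for one change list document."""
    root = ET.fromstring(xml_text.strip())
    grouped = defaultdict(list)
    for url in root.findall("sm:url", NS):
        loc = url.find("sm:loc", NS)
        md = url.find("rs:md", NS)
        if loc is None or md is None:
            continue  # skip entries without a location or change metadata
        grouped[md.get("change", "unknown")].append(loc.text)
    return dict(grouped)


change_report = changes_by_type(SAMPLE)
```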

The notion that we can just copy in data or change data based on an "authorised" source is problematic.
Thanks,
      GerardM

On 10 June 2016 at 21:12, Sandra Fauconnier <sandra.fauconnier@gmail.com> wrote:
I recently read an interview with Herbert Van de Sompel, who has worked on OAI-PMH and the Memento project, among other things (for those for whom that rings a bell).

His team recently developed an initiative called ResourceSync, which seems to address exactly this: keeping distributed databases on the web mutually up to date.
It’s the closest thing I’ve ever seen that seems to address what we (and the entire interlinked web) would need in this area. I might have missed other initiatives, but this one gave me a big AHA moment!

Here’s a short video that explains the principle: https://www.youtube.com/watch?v=ASQ4jMYytsA

In the interview I read, Herbert said that it hasn’t seen wide adoption yet, though. I can imagine that, if the Wikimedia projects’ software adopted this, it might have a snowball effect.

Best, Sandra

On 10 Jun 2016, at 20:43, Benjamin Good <ben.mcgee.good@gmail.com> wrote:

Hi Julie,

We've thought a lot about this, but not done anything formally yet.  There is an example of this happening to improve the disease ontology presented in this paper [1].

Mechanically, parties interested in a particular swath of data linked to their resource could set up repeated SPARQL queries to watch for changes. Beyond that, the core MediaWiki API could be used to create alerts when new discussions are started on articles or items of interest.
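
A minimal Python sketch of the repeated-query idea, under stated assumptions: it uses the public Wikidata Query Service endpoint, and the example query assumes P699 is the Disease Ontology ID property. The function names, the User-Agent string and the snapshot format are hypothetical. A party would run fetch_snapshot() on a schedule, store the result, and compare it with the previous run.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Example query (assumption: P699 is the Disease Ontology ID property).
QUERY = """
SELECT ?item ?value WHERE { ?item wdt:P699 ?value . } LIMIT 100
"""


def fetch_snapshot(query=QUERY, endpoint=ENDPOINT):
    """Run the query once and return {item URI: set of values}."""
    url = endpoint + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url, headers={"User-Agent": "change-watcher-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    snapshot = {}
    for row in data["results"]["bindings"]:
        snapshot.setdefault(row["item"]["value"], set()).add(
            row["value"]["value"]
        )
    return snapshot


def diff_snapshots(old, new):
    """Compare two snapshots and report added, removed and changed items."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

The diff is computed locally, so it works with any stored pair of snapshots, for example diff_snapshots(yesterday, fetch_snapshot()).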

At some point we hope to produce a reporting site that would aggregate this kind of information in our domain (feedback and changes by the community) as well as changes by our bots, and provide reports back to the primary sources and to whoever else is interested. (Maybe we will see a start on that this summer.) This hasn't become a priority yet because we haven't yet generated the community scope to make it a really valuable source of input to the original databases.



On Fri, Jun 10, 2016 at 11:31 AM, Julie McMurry <mcmurry.julie@gmail.com> wrote:
It is great that Wikidata provides a way for data to be curated in a crowd-sourced way.
It would be even better if changes (especially corrections) could be communicated back to the original source so that all could benefit.

Has this been discussed previously? Considered?

Julie

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata

