Hi GerardM and Nemo,
what Nemo said is kind of OK, because the comparison to pixie dust holds.
We are trying to decentralize a lot, which makes everything seem very vague. The same is probably true of Tim Berners-Lee's new project: https://solid.mit.edu/
At first glance they offer the same features as Facebook and Twitter, which makes it hard to believe that they will be successful. The trick here is to provide the right incentives and usefulness, which will create the network effect.
The main problem I see is that data quality follows the Pareto distribution. The more data you have and the better its quality, the harder it gets to improve it further. Test-driven validation only makes the work more efficient, but does not beat the Pareto distribution. Networked data can help here by enabling reuse and kind of cheating Pareto, but not beating it. If you crack the incentives/network issue, that is the pixie dust that makes the thing fly.
Working with data is hard and repetitive. We envision a hub, where everybody can upload data and then useful operations like versioning, cleaning, transformation, mapping, linking, merging, hosting is done
Sounds like Wikidata!
@Nemo: In my experience you can't really upload data to Wikidata. It comes with a lot of barriers. In the beginning, I understood the argument that you couldn't load DBpedia or Freebase because there were no references. Now I have seen stats showing that half the statements are not referenced anyway and another third reference Wikipedia. https://docs.google.com/presentation/d/1XX-yzT98fglAfFkHoixOI1XC1uwrS6f0u1xj... So in hindsight, Wikidata could easily have started out with DBpedia, would have had a much better start, and would be much more developed by now.
DBpedia's properties are directly related to the infobox properties, since all the data is extracted from there, and Wikidata aims to cover the infoboxes as well, so it is a perfect match. The Wikidata community has thus spent a lot of time adding data that could have simply been uploaded right from the start, and could have focused on the references instead.
There is also Cunningham's law (named after Ward Cunningham, the inventor of wikis): "the best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer." So the extraction errors would have been an incentive to fix them...
Now Wikidata is dealing with this https://en.wikipedia.org/wiki/Wikipedia:Wikidata/2018_Infobox_RfC and we are concerned that it will not reach its goal to the fullest. We are still very interested in collaborating on this and contributing where we can.
All the best, Sebastian
On 15.05.2018 07:59, Gerard Meijssen wrote:
Hoi, We do not provide useful operations like versioning, cleaning, transformation at Wikidata. We do not compare, we do not curate at Wikidata.
So when somewhere else they make it their priority and do a better job at it, rejoice, don't mock. The GREAT thing about DBpedia is that they /are/ willing to collaborate. Thanks, GerardM
On 15 May 2018 at 07:51, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
Sebastian Hellmann, 08/05/2018 14:29:
Working with data is hard and repetitive. We envision a hub, where everybody can upload data and then useful operations like versioning, cleaning, transformation, mapping, linking, merging, hosting is done
Sounds like Wikidata!
automagically
Except this. There is always some market for pixie dust.
Federico
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata