Re: [Wikidata] from Freebase to Wikidata: the great migration

29 Feb 2016

Hi all,

thank you for the interest in the primary sources tool!

I want to make sure that there are no false expectations. Google
committed to delivering the initial tool. Thanks to Thomas P’s internship and
support from Thomas S and Sebastian, and with the release of the data, the
code, the paper, and all services running on Wikimedia infrastructure, we
have achieved that milestone. The tool was developed as open source, in
order to allow the community to continue to mold it and to invest in it as
the community sees warranted.

I am particularly thankful to Marco Fossati for his work in creating
further datasets. Thomas S has in the last few days cleaned up the issue
list and merged pull requests. Thank you, Thomas! We are all very thankful
for the pull requests, in particular to Thomas P, Wieland Hoffmann, and Tom
Morris. In general, we plan to keep the tool up as far as our time allows,
and continue to merge such requests, but we have no concrete plans of
extending its functionality right now.

We are very grateful to everyone contributing to the project, or using the
tool. If anyone wants to take over the project, we would invite you to
contribute for a while first, and then let’s discuss it. I would be
thrilled to see this tool develop.

As a reminder, a lot of data has been released under CC0. We invite all to
play around with the data and see if there are slices of the data that can
be directly uploaded to Wikidata, as Gerard suggests.

If there are any questions, we’ll try to answer them. Again, thanks
everyone!

Cheers,
Denny

On Tue, Feb 23, 2016 at 11:08 AM Tom Morris <tfmorris(a)gmail.com> wrote:

...
 On Tue, Feb 23, 2016 at 1:52 PM, Stas Malyshev <smalyshev(a)wikimedia.org> wrote:

 As Gerard has pointed out before, he prefers to re-enter statements
 instead of approving them. This means that the real number of "imported"
 statements is higher than what is shown in the dashboard (how much so
 depends on how many statements Gerard and others with this approach have
 added). It seems that one should rather analyse the number of statements

 Yes, I do that sometimes too - if there is a statement saying "spouse:
 X" on Wikidata, and a statement in Freebase saying the same but with the
 start date, or the Freebase one has a more precise date than the Wikidata
 one, such as a full date instead of just a year, I will modify the original
 statement and reject the Freebase one.

 I filed a bug report for this yesterday:
 https://github.com/google/primarysources/issues/73
 I'll add the information about more precise qualifiers, since I didn't
 address that part.

 I'm not sure this is the best practice with regard to tracking numbers,
 but it's easiest, and even if my personal numbers do not matter too much
 I imagine other people do this too. So rejection does not really mean the
 data was not entered - it may mean it was entered in a different way.
 Sometimes also, while the data is already there, the reference is not, so
 the reference gets added.

 Even if you don't care about your personal numbers, I'd argue that not
 being able to track the quality of data sources feeding the Primary Sources
 tool is an issue.  It's valuable to not only measure quality for entire
 data sets, but also for particular slices of them since data sources, at
 least large ones like Freebase, are rarely homogeneous in quality.

 It's also clearly an issue that the tool is so awkward that people are
 working around it instead of having it help them.

 Tom
 _______________________________________________
 Wikidata mailing list
 Wikidata(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata
