Note that Freebase did a lot of human curation, and we know they could get about 3000 fact verifications a day from "non-experts" who were paid for their efforts. That scales out to almost a million facts per FTE per year.
Where can I find out more about how they were able to do such high-volume human curation? 3000/day is a huge number.
On Thu, Jul 16, 2015 at 5:01 AM, wikidata-request@lists.wikimedia.org wrote:
Date: Wed, 15 Jul 2015 15:25:27 -0400
From: Paul Houle <ontology2@gmail.com>
To: "Discussion list for the Wikidata project." <wikidata-l@lists.wikimedia.org>
Subject: [Wikidata] Freebase is dead, long live :BaseKB
For those who are interested in the project of getting something out of Freebase for use in Wikidata or somewhere else, I'd like to point out that this is a completely workable solution for running queries against Freebase after the MQL API goes dark.
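To make "running queries after the MQL API goes dark" concrete, here is a minimal sketch. It assumes the Freebase/:BaseKB RDF dump has been loaded into a local SPARQL endpoint of your own (the endpoint URL below is hypothetical) and that the Python SPARQLWrapper package is available; the namespace and predicate are the ones used in the Freebase RDF dump.

    # Illustrative only: query a local triple store loaded with the Freebase/:BaseKB RDF dump.
    # The endpoint URL is hypothetical; point it at wherever your store (Fuseki, Virtuoso, ...) listens.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://localhost:3030/basekb/sparql")  # hypothetical local endpoint
    sparql.setReturnFormat(JSON)
    sparql.setQuery("""
        PREFIX ns: <http://rdf.freebase.com/ns/>
        SELECT ?name WHERE {
            ns:m.02mjmr ns:type.object.name ?name .   # m.02mjmr is the Freebase MID for Barack Obama
            FILTER (lang(?name) = "en")
        }
    """)

    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["name"]["value"])

The point is simply that once the dump is in a triple store, plain SPARQL stands in for what MQL used to do.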
I have been watching the discussion about the trouble of moving Freebase data to Wikidata, and I'd like to share some thoughts.
First, quality is in the eye of the beholder. If somebody defines quality as a matter of citing your sources, then that is their definition of 'quality' and they can attain it. You might have some other definition of quality and be appalled that Wikidata has so little to say about a topic that has caused much controversy and suffering:
https://www.wikidata.org/wiki/Q284451
There are ways to attain that, too.
Part of the answer is that different products are going to be used in different places. For instance, one person might need 100% coverage of books he wants to talk about, another one might want a really great database of ski areas, etc.
Note that Freebase did a lot of human curation, and we know they could get about 3000 fact verifications a day from "non-experts" who were paid for their efforts. That scales out to almost a million facts per FTE per year.
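As a rough sanity check on that figure (the working-days-per-year number is an assumption of this sketch, not something Freebase published):

    # Back-of-the-envelope check of the "almost a million facts per FTE per year" claim.
    verifications_per_day = 3000      # paid non-expert verifications, as stated above
    working_days_per_year = 250       # assumed: roughly one full-time work year

    print(verifications_per_day * working_days_per_year)  # 750000 -- on the order of a million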
-- Paul Houle
*Applying Schemas for Natural Language Processing, Distributed Systems, Classification and Text Mining and Data Lakes*
(607) 539 6254 paul.houle on Skype ontology2@gmail.com https://legalentityidentifier.info/lei/lookup/