I know Freebase used oDesk. Note that the number in question is 3000
judgements per person per day. I've run tasks on Mechanical Turk, and I
also make my own judgement sets for various things, and I'd agree with
that rate; it comes to 9.6 seconds per judgement, which I can believe. If
you are that fast you can make a living at it and never have to get out of
your pyjamas, but as a manager you have to do something about people who
do huge amounts of fast but barely acceptable work.
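To make the back-of-the-envelope arithmetic explicit (the 8-hour working
day is my assumption):

    # Seconds per judgement at 3000 per day,
    # assuming an 8-hour working day.
    judgements_per_day = 3000
    working_seconds = 8 * 60 * 60                 # 28,800 seconds
    print(working_seconds / judgements_per_day)   # -> 9.6 seconds each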
Note that they had $57M of funding
https://www.crunchbase.com/organization/metawebtechnologies
and if the fully loaded cost of those FTE equivalents was $50,000 a year
via oDesk, it would have cost about $5M to get 100 million facts
processed. So practically they could have got a lot done. Metaweb and
oDesk had
interlocking directorates
https://www.crunchbase.com/organization/metawebtechnologies/insights/curren…
so they probably had a great relationship with oDesk, which would have
helped.
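Spelling out the cost arithmetic (the ~1 million facts per FTE-year rate
comes from the figure quoted below; the $50,000 fully loaded annual cost
is my assumption):

    # Rough cost to get 100 million facts verified at Freebase-like rates.
    facts_needed = 100_000_000
    facts_per_fte_year = 1_000_000     # ~3000/day over a working year
    cost_per_fte_year = 50_000         # assumed fully loaded cost via oDesk
    fte_years = facts_needed / facts_per_fte_year   # 100 FTE-years
    print(fte_years * cost_per_fte_year)            # -> 5000000.0, i.e. $5M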
Dealing with "turks", I would estimate that I'd need to ask each question
somewhere between 2 and 3 times on average to catch most of the errors and
ambiguous cases, and also to get an estimate of how many I didn't catch.
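Here is a minimal sketch of the kind of redundancy scheme I mean; the 2/3
agreement threshold and the re-ask policy are just illustrative, not
anything Metaweb actually used:

    from collections import Counter

    def resolve(judgements):
        # Majority-vote one question's answers; flag it for another round
        # when fewer than 2/3 of the judgements agree.
        counts = Counter(judgements)
        answer, votes = counts.most_common(1)[0]
        return answer, votes / len(judgements) >= 2 / 3

    # Ask each question twice, then re-ask only the disagreements.
    first_round = {"q1": ["yes", "yes"], "q2": ["yes", "no"]}
    needs_third = [q for q, js in first_round.items() if not resolve(js)[1]]
    print(needs_third)   # -> ['q2']; the disagreement rate also gives a rough
                         # handle on how many wrong answers slip through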
On Thu, Jul 16, 2015 at 8:23 PM, Eric Sun <esun(a)cs.stanford.edu> wrote:
> Note that Freebase did a lot of human curation and we know they could get
> about 3000 verifications of facts by "non-experts" a day who were paid for
> their efforts. That scales out to almost a million facts per FTE per year.

Where can I find out more about how they were able to do such high-volume
human curation? 3000/day is a huge number.
On Thu, Jul 16, 2015 at 5:01 AM, <wikidata-request(a)lists.wikimedia.org>
wrote:
> Date: Wed, 15 Jul 2015 15:25:27 -0400
> From: Paul Houle <ontology2(a)gmail.com>
> To: "Discussion list for the Wikidata project."
> <wikidata-l(a)lists.wikimedia.org>
> Subject: [Wikidata] Freebase is dead, long live :BaseKB
> Message-ID: <CAE__kdQt55E7k7xHMeuBCu9QrwRKoMU_60NDuYgcTHNkC7DFHA(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> For those who are interested in the project of getting something out of
> Freebase for use in Wikidata or somewhere else, I'd like to point out
>
>
> http://basekb.com/gold/
>
> this is a completely workable solution for running queries out of Freebase
> after the MQL API goes dark.
>
> I have been watching the discussion about the trouble of moving Freebase
> data to Wikidata, so let me share some thoughts.
>
> First, quality is in the eye of the beholder: if somebody defines quality
> as a matter of citing your sources, then that is their definition of
> 'quality' and they can attain it. You might have some other definition of
> quality and be appalled that Wikidata has so little to say about a topic
> that has caused much controversy and suffering:
>
>
> https://www.wikidata.org/wiki/Q284451
>
> there are ways to attain that too.
>
> Part of the answer is that different products are going to be used in
> different places. For instance, one person might need 100% coverage of
> books he wants to talk about, another one might want a really great
> database of ski areas, etc.
>
> Note that Freebase did a lot of human curation and we know they could get
> about 3000 verifications of facts by "non-experts" a day who were paid for
> their efforts. That scales out to almost a million facts per FTE per year.
>
>
>
> --
> Paul Houle
>
> *Applying Schemas for Natural Language Processing, Distributed Systems,
> Classification and Text Mining and Data Lakes*
>
> (607) 539 6254 paul.houle on Skype ontology2(a)gmail.com
>
> https://legalentityidentifier.info/lei/lookup/
>