Thanks @Benjamin for the pointers!
I completely agree with @Tom.
I've also been researching techniques for crowdsourcing micro-tasks,
mostly for NLP activities like frame semantics annotation:
http://www.aclweb.org/anthology/P13-2130
http://ceur-ws.org/Vol-1030/paper-03.pdf
I found that a crowd of paid workers can really make a difference,
even for difficult and subjective tasks like these.
So here are my 2 cents to get the best out of it:
1. Take extreme care with quality-check mechanisms: for instance, the
CrowdFlower.com platform provides a facility for automatically
discarding untrusted workers;
2. Keep each micro-task atomic, i.e., do not bundle multiple sub-tasks
into one;
3. UI design is always crucial: use simple words, give clear examples,
and avoid screen scrolling.
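To make point 1 concrete, here is a minimal, hypothetical sketch of a gold-question filter in the spirit of CrowdFlower's hidden test questions. All names, the data shapes, and the 0.7 accuracy threshold are illustrative assumptions, not any platform's actual API:

```python
# Hypothetical quality-check sketch: workers whose accuracy on hidden
# gold questions falls below a threshold are discarded, and only the
# remaining workers' non-gold judgments are kept for aggregation.

def filter_trusted(judgments, gold, min_accuracy=0.7):
    """judgments: {worker_id: {task_id: answer}};
    gold: {task_id: correct_answer} for the hidden test questions."""
    trusted = {}
    for worker, answers in judgments.items():
        graded = [(t, a) for t, a in answers.items() if t in gold]
        if not graded:
            continue  # no gold overlap yet: cannot assess this worker
        accuracy = sum(a == gold[t] for t, a in graded) / len(graded)
        if accuracy >= min_accuracy:
            # keep only the worker's non-gold judgments for aggregation
            trusted[worker] = {t: a for t, a in answers.items()
                               if t not in gold}
    return trusted
```

For example, a worker who answers both gold questions correctly is kept, while one who gets half of them wrong is silently dropped.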
Cheers!
On 7/18/15 2:00 PM, wikidata-request(a)lists.wikimedia.org wrote:
Date: Fri, 17 Jul 2015 13:42:55 -0400
From: Tom Morris<tfmorris(a)gmail.com>
To: "Discussion list for the Wikidata project."
<wikidata(a)lists.wikimedia.org>
Subject: Re: [Wikidata] Freebase is dead, long live :BaseKB
Message-ID:
<CAE9vqEF3+xQtkbBiiV0Co2Lz8_HMvKhTh=wHyZ2UP0PK=CmQ(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
3,000 judgments per person per day sounds high to me, particularly on a
sustained basis, but it really depends on the type of task. Some of the
tasks were very simple with custom high performance single purpose "games"
designed around them. For example, Genderizer presented a person's
information and offered the choices Male, Female, Other, and Skip.
Binding the four choices to arrow keys so a choice could be made
without moving one's hand, preloading the next topic in the background,
and allowing votes to be undone in case of error were all features that
let voters make choices very quickly.
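As an illustration only (this is not Freebase's actual code, and all names and bindings are invented), the three speed features just described (single-key choices, background preloading of the next topic, and undo) might be sketched like this:

```python
# Illustrative sketch of a Genderizer-style judgment loop: one keystroke
# per choice, a prefetched queue of upcoming topics, and an undo stack.
from collections import deque

# Hypothetical key bindings; the real tool's bindings are not documented here.
KEY_BINDINGS = {"left": "Male", "right": "Female", "up": "Other", "down": "Skip"}

class JudgmentSession:
    def __init__(self, topics):
        self.queue = deque(topics)  # stands in for background preloading
        self.history = []           # (topic, choice) pairs, for undo

    def next_topic(self):
        return self.queue[0] if self.queue else None

    def judge(self, key):
        choice = KEY_BINDINGS[key]  # one keystroke, no mouse movement
        topic = self.queue.popleft()
        self.history.append((topic, choice))
        return topic, choice

    def undo(self):
        # put the last topic back at the front of the queue for re-judging
        topic, _ = self.history.pop()
        self.queue.appendleft(topic)
```

The point of the design is that the expensive steps (loading the next item, recovering from a mistake) are taken off the critical path of each judgment.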
The figures quoted in the paper below (18 seconds per judgment) work out to
more like 1,600 judgments per eight-hour day. They collected 2.3 million
judgments over the course of a year from 555 volunteers (1.05 million
judgments) and 84 paid workers (1.25 million).
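Working the quoted 18-seconds-per-judgment rate through explicitly (the eight-hour working day is an assumption taken from the text above):

```python
# Sanity check of the throughput figures quoted above.
SECONDS_PER_JUDGMENT = 18
SECONDS_PER_DAY = 8 * 60 * 60                       # 28,800 seconds
judgments_per_day = SECONDS_PER_DAY // SECONDS_PER_JUDGMENT
print(judgments_per_day)                            # 1600

# The yearly totals are internally consistent as well:
volunteer_judgments = 1.05e6   # from 555 volunteers
paid_judgments = 1.25e6        # from 84 paid workers
print(volunteer_judgments + paid_judgments)         # 2.3 million
```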
On Fri, Jul 17, 2015 at 12:35 PM, Benjamin Good<ben.mcgee.good(a)gmail.com>
wrote:
>They wrote a really insightful paper about how their processes for
>large-scale data curation worked. Among many other things, they
>investigated mechanical turk 'micro tasks' versus hourly workers and
>generally found the latter to be more cost effective.
>
>"The Anatomy of a Large-Scale Human Computation Engine"
>http://wiki.freebase.com/images/e/e0/Hcomp10-anatomy.pdf
>
The full citation, in case someone needs to track it down, is:
Kochhar, Shailesh, Stefano Mazzocchi, and Praveen Paritosh. "The anatomy of
a large-scale human computation engine." *Proceedings of the ACM SIGKDD
Workshop on Human Computation*. ACM, 2010.
There's also a slide presentation by the same name which presents some
additional information:
http://www.slideshare.net/brixofglory/rabj-freebase-all-5049845
Praveen Paritosh has written a number of papers on the topic of human
computation, if you're interested in that (I am!):
https://scholar.google.com/citations?user=_wX4sFYAAAAJ&hl=en&oi=sra
--
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j