Thanks @Benjamin for the pointers! I completely agree with @Tom.
I've also been researching techniques for crowdsourcing micro-tasks, mostly for NLP activities like frame semantics annotation: http://www.aclweb.org/anthology/P13-2130 http://ceur-ws.org/Vol-1030/paper-03.pdf
I've found that a crowd of paid workers can really make a difference, even on difficult and subjective tasks like these.
So here are my two cents on getting the best out of it:

1. Take extreme care with quality-check mechanisms: the CrowdFlower.com platform, for instance, has a facility that automatically discards untrusted workers (the sketch below shows the general idea);
2. Keep each micro-task atomic, i.e., free of multiple sub-tasks;
3. UI design is always crucial: simple words, clear examples, no screen scrolling.
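To make point 1 concrete, here is a minimal Python sketch of the gold-question idea behind such filters. GOLD_ANSWERS, TRUST_THRESHOLD, and the data shapes are my own assumptions, not CrowdFlower's actual API:

from collections import defaultdict

# Hypothetical data: gold tasks are questions whose answer is known
# in advance, so they can be slipped in to measure worker accuracy.
GOLD_ANSWERS = {"t17": "Female", "t42": "Male"}   # made-up example
TRUST_THRESHOLD = 0.7                             # assumed cutoff

def trusted_workers(judgments):
    """Keep only workers whose accuracy on gold tasks meets the
    threshold; the rest are discarded as untrusted."""
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, task, answer in judgments:
        if task in GOLD_ANSWERS:
            seen[worker] += 1
            correct[worker] += (answer == GOLD_ANSWERS[task])
    return {w for w in seen if correct[w] / seen[w] >= TRUST_THRESHOLD}

judgments = [("alice", "t17", "Female"), ("alice", "t42", "Male"),
             ("bob",   "t17", "Male"),   ("bob",   "t42", "Male")]
print(trusted_workers(judgments))   # -> {'alice'}

In practice, platforms interleave the gold questions with real work so workers can't tell which is which, and recompute accuracy continuously.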
Cheers!
On 7/18/15 2:00 PM, wikidata-request@lists.wikimedia.org wrote:
Date: Fri, 17 Jul 2015 13:42:55 -0400
From: Tom Morris <tfmorris@gmail.com>
To: "Discussion list for the Wikidata project." <wikidata@lists.wikimedia.org>
Subject: Re: [Wikidata] Freebase is dead, long live :BaseKB
3,000 judgments per person per day sounds high to me, particularly on a sustained basis, but it really depends on the type of task. Some of the tasks were very simple, with custom high-performance, single-purpose "games" designed around them. For example, Genderizer presented a person's information and allowed choices of Male, Female, Other, and Skip. Binding the four choices to arrow keys so they could be selected without moving one's hand, pipelining (preloading the next topic in the background), and allowing votes to be undone in case of error were all features that let voters make choices very quickly.
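None of this is Freebase's actual code, but the pipelining-plus-undo pattern described above is generic enough to sketch in Python; fetch_topic, JudgmentSession, and the lookahead size are stand-ins of mine:

import queue
import threading

def fetch_topic(topic_id):
    # Stand-in for the real data fetch; in Genderizer this would load
    # a person's information from the backend.
    return {"id": topic_id}

class JudgmentSession:
    """Preload upcoming topics in the background and keep votes on a
    stack so a mis-keyed choice can be taken back."""

    def __init__(self, topic_ids, lookahead=2):
        self._pipeline = queue.Queue(maxsize=lookahead)
        self.votes = []
        threading.Thread(target=self._prefetch, args=(list(topic_ids),),
                         daemon=True).start()

    def _prefetch(self, topic_ids):
        for tid in topic_ids:                     # runs in the background,
            self._pipeline.put(fetch_topic(tid))  # blocking when full

    def next_topic(self):
        return self._pipeline.get()    # usually already loaded and waiting

    def vote(self, topic, choice):     # choice would be bound to an arrow key
        self.votes.append((topic["id"], choice))

    def undo(self):
        # Take the most recent vote back in case of error.
        return self.votes.pop() if self.votes else None

session = JudgmentSession(["p1", "p2", "p3"])
session.vote(session.next_topic(), "Female")
session.undo()   # reverses the vote just cast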
The figures quoted in the paper below (18 seconds per judgment) work out to more like 1,600 judgments per eight-hour day. They collected 2.3 million judgments over the course of a year from 555 volunteers (1.05 million judgments) and 84 paid workers (1.25 million).
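For what it's worth, a quick check of that arithmetic (mine, not the paper's):

seconds_per_judgment = 18
workday = 8 * 60 * 60                  # seconds in an eight-hour day
print(workday / seconds_per_judgment)  # 1600.0 judgments per day
print(1.05e6 + 1.25e6)                 # 2.3 million total, as quoted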
On Fri, Jul 17, 2015 at 12:35 PM, Benjamin Good <ben.mcgee.good@gmail.com> wrote:
They wrote a really insightful paper about how their processes for large-scale data curation worked. Among many other things, they investigated Mechanical Turk 'micro-tasks' versus hourly workers and generally found the latter to be more cost-effective.
"The Anatomy of a Large-Scale Human Computation Engine" http://wiki.freebase.com/images/e/e0/Hcomp10-anatomy.pdf
The full citation, in case someone needs to track it down, is:
Kochhar, Shailesh, Stefano Mazzocchi, and Praveen Paritosh. "The anatomy of a large-scale human computation engine." *Proceedings of the ACM SIGKDD Workshop on Human Computation*. ACM, 2010.
There's also a slide presentation of the same name that presents some additional information: http://www.slideshare.net/brixofglory/rabj-freebase-all-5049845
Praveen Paritosh has written a number of papers on the topic of human computation, if you're interested in that (I am!): https://scholar.google.com/citations?user=_wX4sFYAAAAJ&hl=en&oi=sra