[Wikimedia-l] "Big data" benefits and limitations (relevance: WMF editor engagement, fundraising, and HR practices)

Tim Starling tstarling at wikimedia.org
Thu Jan 3 06:38:11 UTC 2013

On 03/01/13 16:09, George Herbert wrote:
> Laugh all you want, but the best man at my wedding's scalable P2P in
> the cloud company was acquired by Adobe, then he was poached by Skype
> who were poached by Microsoft, and now he's a Very Senior Architect
> spending most of his time flying around the world to far-flung
> offices, architecting and implementing scalable P2P in the cloud.

Flying sucks. Time spent flying should be a measure of failure, not
success.

Anyway, I wouldn't go so far as to deny the existence of
petabyte-sized data sets, or to deny that some organisations derive
value from being able to pass them through CPUs in a reasonable amount
of time. I merely question the value of a mailing list post that says
"hey, big data, we should do that".

Wikipedia's problems are obvious and severe:

* Incivility by established users towards new users
* Capture of articles by self-appointed "owners"
* Sneaky vandalism and misinformation

If you look at the comments section of any online news article about
Wikipedia, you will see these valid criticisms repeated over and over
as reasons why people have stopped contributing to Wikipedia or refuse
to start. The number of active (>5 edits/mo) contributors has declined
from 13000 in January 2007 to 5900 in October 2012.

You don't need "big data" to see what needs to be done.

-- Tim Starling
