On Thu, Sep 24, 2009 at 12:50 PM, David Gerard dgerard@gmail.com wrote:
2009/9/24 Aryeh Gregor Simetrical+wikilist@gmail.com:
Do these statistics take into account things like vandalism reversions? Also, how do they handle anonymous users -- are they summed up by edit count like anyone else? I distinctly remember seeing a study conclude that most of the actual content comes from users with few edits, but I can't recall where or how long ago (possibly two or three years).
Aaron Swartz.
http://www.aaronsw.com/weblog/whowriteswikipedia
Most of the edits are done by a very small group of regulars.
But most of the actual text is contributed by drive-by contributors and then beaten into shape by the regulars.
Yes, I have seen that too. My analysis, using a blame engine against a Wikipedia's full history, suggests that Aaron's thesis is simply not true in general. There could be any number of reasons he got a different result. For example, he looked at only one article, about a fairly well-known person, which may not have been representative of the bulk of wiki content. Or things may be different in English than in, say, Russian (one of the wikis I used), or things may have changed since 2006. Whatever the reason, my conclusion is that the core, highly active community (perhaps 25,000 accounts on a site the size of enwiki) contributes more than 50% of the currently displayed text.
To answer Aryeh: yes, I paid attention to handling vandalism reversions, and yes, anons were tracked as if they were users.
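For readers unfamiliar with the term, the general idea of a blame engine can be sketched roughly as below. This is only a toy illustration under stated assumptions, not the code behind the analysis described here: the function names `blame` and `share_by_author`, the whitespace tokenization, the use of Python's difflib, and the "exact text match means revert" rule are all assumptions made for the example. Anonymous editors are simply keyed by their IP string, like any other account.

```python
import difflib
from collections import Counter


def blame(revisions):
    """Attribute each token of the latest revision to an author.

    revisions: list of (author, text) pairs in chronological order.
    Returns a list of (token, author) pairs for the final revision.
    Hypothetical helper for illustration only.
    """
    seen = {}      # exact revision text -> blame list recorded for it
    current = []   # blame list for the most recent revision processed
    for author, text in revisions:
        if text in seen:
            # Assumed revert rule: the text exactly matches an earlier
            # revision, so restore that revision's attribution instead
            # of crediting the reverting editor with the content.
            current = list(seen[text])
            continue
        old_tokens = [tok for tok, _ in current]
        new_tokens = text.split()
        matcher = difflib.SequenceMatcher(
            a=old_tokens, b=new_tokens, autojunk=False
        )
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Surviving tokens keep their original author.
                updated.extend(current[i1:i2])
            elif tag in ("replace", "insert"):
                # New tokens are credited to the current editor.
                updated.extend((tok, author) for tok in new_tokens[j1:j2])
            # "delete": removed tokens drop out of the blame list.
        current = updated
        seen[text] = list(current)
    return current


def share_by_author(blamed):
    """Fraction of currently displayed tokens credited to each author."""
    counts = Counter(author for _, author in blamed)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()} if total else {}


if __name__ == "__main__":
    history = [
        ("alice", "the cat sat on the mat"),
        ("10.0.0.1", "VANDALISM"),               # anon, tracked by IP
        ("bob", "the cat sat on the mat"),       # revert of the vandalism
        ("bob", "the black cat sat on the mat"),
    ]
    print(share_by_author(blame(history)))
```

In this sketch the reverting editor gets no credit for text that is merely restored, which is the sort of handling the vandalism question is about. A real engine would also need to cope with partial reverts, moved text, and the sheer size of a full history dump.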
I went into it expecting a result like that described in the blog post, and came out with the opposite conclusion.
-Robert Rohde
PS. A full write-up of this analysis has been on my to-do list for a while now.