G'day folks,
Nature has published an article on studies on Wikipedia editing patterns.
http://www.nature.com/news/2007/070226/full/070226-6.html
I have added some highlights of the article:
*Highly edited*
Instead, there is an abnormally high number of very highly edited entries. The researchers say this is just what is expected if the number of new edits to an article is proportional to the number of previous edits. In other words, edits attract more edits. The disproportionately highly edited articles, the researchers say, are those that deal with very topical issues.
And does this increased attention make them better? So it seems. Although the quality of an entry is not easy to assess automatically, Wilkinson and Huberman assume that those articles selected as the 'best' by the Wikipedia user community are indeed in some sense superior. These, they say, are more highly edited, and by a greater number of users, than the less visible entries.
Who is making these edits, though? Some have claimed that Wikipedia articles don't truly draw on the collective wisdom of its users, but are put together mostly by a small, select élite, including the system's administrators. Wikipedia co-founder Jimmy Wales has admitted that he spends "a lot of time listening to four or five hundred" top users.
Aniket Kittur of the University of California, Los Angeles, and co-workers have set out to discover who really does the editing [2]. They have looked at 4.7 million pages from the English-language Wikipedia, subjected to a total of about 58 million revisions, to see who was making the changes, and how.
The results were striking. In effect, the Wiki community has mutated since 2001 from an oligarchy to a democracy. The percentage of edits made by the Wikipedia 'élite' of administrators increased steadily up to 2004, when it reached around 50%. But since then it has steadily declined, and is now just 10% (and falling).
*Weight of numbers*
Even though the edits made by this élite are generally more substantial than those made by the masses, their overall influence has clearly waned. Wikipedia is now dominated by users who are much more numerous than the élite but individually less active. Kittur and colleagues compare this to the rise of a powerful bourgeoisie within an oligarchic society.
It concludes:
This diversification of contributors is beneficial, Ofer Arazy and colleagues at the University of Alberta in Canada have found [3]. In 2005, when *Nature*'s news team arranged for expert comparisons between articles in Wikipedia and Encyclopaedia Britannica online, the experts found only a moderate excess of errors in the Wikipedia articles [4]. (The idea that the two sources were broadly similar was vigorously challenged by the Encyclopaedia Britannica; see http://www.nature.com/nature/journal/v438/n7070/full/438900a.html and http://www.nature.com/news/2006/060327/full/440582b.html.) Arazy's team says that of the 42 Wikipedia entries assessed in the article, the number of errors decreased as the number of different editors increased.
The main lesson for tapping effectively into the 'wisdom of crowds', then, is that the crowd should be diverse. In fact, in 2004 Lu Hong and Scott Page of the University of Michigan in Ann Arbor showed that a problem-solving team selected at random from a diverse collection of individuals will usually perform better than a team made up of those who individually perform best, because the latter tend to be too similar, and so draw on too narrow a range of options [5]. For crowds, wisdom depends on variety.
Regards
*Keith Old*
On 2/27/07, Keith Old keithold@gmail.com wrote:
<snip>
Excellent find. Thank you for posting that, Keith.
The full paper's preprint is up at: http://xxx.arxiv.org/PS_cache/cs/pdf/0702/0702140.pdf
On 2/27/07, George Herbert george.herbert@gmail.com wrote:
<snip>
(I think this can be answered based on the preprint's data and equations, but I'm not volunteering to figure out the answer.)
One of the central conclusions is "edits beget edits", though this is in terms of correlation, not causation. But assuming causation, how many edits does an edit beget? That is, as a function of total edits, time since creation, and number of new edits in a recent time slice, how many extra edits are expected in the next time slice and beyond? If I make 50 edits in one day to a 1000-edit article of age 50 weeks, how many extra edits will others contribute the next day, and how many extra over the next year?
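A rough way to frame that question, assuming the simplest deterministic reading of "new edits are proportional to previous edits": if an article gains a fixed fraction `rate` of its running total in each time slice, then `k` extra edits made now add about `rate * k` edits in the next slice and `k * ((1 + rate)**slices - 1)` over later slices. The function name, the 1% rate, and the no-noise simplification are all assumptions for illustration, not figures or equations taken from the preprint.

```python
def extra_edits(k, rate, slices):
    """Extra edits by others attributable to k injected edits.

    Assumes total(t+1) = total(t) * (1 + rate), with a constant rate and
    no noise. This is a simplification, not the preprint's fitted model.
    """
    if k < 0 or rate < 0 or slices < 0:
        raise ValueError("k, rate and slices must be non-negative")
    # The injected edits compound like any other edits; subtract the
    # k edits themselves so that only what others add is counted.
    return k * ((1 + rate) ** slices - 1)


# Hypothetical numbers: 50 injected edits, 1% growth per daily slice.
next_day = extra_edits(50, 0.01, 1)      # one slice ahead
next_year = extra_edits(50, 0.01, 365)   # a year of daily slices
print(f"next day: {next_day:.2f}, next year: {next_year:.1f}")
```

Under this reading the answer depends only on the number of injected edits and the rate, not on the article's existing total or its age, so anything finer-grained would have to come from the paper's actual data.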
If someone comes up with the equation for that based on the paper's data, then we could do some measurements on the mechanisms by which edits beget edits. The important question is: do edits really beget edits, or is the correlation between number of new edits and number of total edits simply an artifact of Wikipedia's overall exponential growth coupled with the relationship between article age and article popularity (i.e., more important articles are created earlier)?
I assume the answer is some of both. But it could be tested by individual editors choosing appropriate pairs of articles (of similar quality, time-since-creation, and total edits), and editing one heavily while leaving the other alone. For each pair, measure how many extra edits beyond that editor's work accrue to the edited one compared to the control.
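The paired comparison could be summarised along these lines. This is only a sketch: the function name and the choice of a one-sided sign test are assumptions, and the counts in the example are made up rather than measured.

```python
from math import comb


def paired_effect(pairs):
    """Summarise a treated-vs-control comparison over article pairs.

    pairs: list of (treated_extra, control_extra) edit counts, where the
    treated count already excludes the experimenter's own edits.
    Returns (mean difference, one-sided sign-test p-value).
    """
    # Ties carry no sign information, so the sign test drops them.
    diffs = [t - c for t, c in pairs if t != c]
    if not diffs:
        raise ValueError("need at least one pair with a non-zero difference")
    wins = sum(d > 0 for d in diffs)
    n = len(diffs)
    # P(at least `wins` positive signs out of n) if editing had no effect.
    p_value = sum(comb(n, i) for i in range(wins, n + 1)) / 2 ** n
    mean_diff = sum(t - c for t, c in pairs) / len(pairs)
    return mean_diff, p_value


# Hypothetical counts for five article pairs (edited one, untouched one).
example = [(12, 5), (3, 4), (9, 2), (7, 7), (15, 6)]
print(paired_effect(example))
```

With only a handful of pairs the p-value will rarely be small, which is why a large number of pairs would be needed before reading much into the result.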
-Sage
On 2/27/07, Sage Ross sage.ross@yale.edu wrote:
If someone comes up with the equation for that based on the paper's data, then we could do some measurements on the mechanisms by which edits beget edits. The important question is: do edits really beget edits, or is the correlation between number of new edits and number of total edits simply an artifact of Wikipedia's overall exponential growth coupled with the relationship between article age and article popularity (i.e., more important articles are created earlier)?
I know that an edit to something on my watchlist draws my attention and, in the course of checking the edit, I may see other things that need fixing. This is a wholly unscientific observation, of course.
Adam
T P wrote:
<snip>
Additional anecdotal evidence: Sometimes I hit the random button and find a page that has not been edited for many months. I may make a simple change, perhaps do a bit of wikifying or fix a spelling mistake, and put the page on my watchlist. Oftentimes such a page will attract other changes from other editors. I assume that these others were watching recent changes, or had the page on their watchlist and their attention was drawn by my edit (as Adam recounts, above).
-Rich
On Feb 27, 2007, at 11:17 PM, Rich Holton wrote:
I assume that these others were watching recent changes, or had the page on their watchlist and their attention was drawn by my edit (as Adam recounts, above).
That happens to me quite often as well. An article on my watchlist that has not shown up there for months suddenly lights up; I check the edit and add a new tidbit, a new source perhaps. That attracts a few edits from other editors, only for the article to disappear for a few months, and so on...
-- Jossi
On 3/1/07, Jossi Fresco jossifresco@mac.com wrote:
<snip>
This brings up an interesting statistical analysis problem: do the database exports include any user info, such as userid/name and user watchlists? Is it possible to look at this factor in a public database export, or would someone have to do statistical analysis on one of the WMF MySQL mirrors?
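On the userid/name half of that question, here is a minimal sketch, assuming the export is in MediaWiki's XML format with a `<contributor>` element inside each `<revision>`. The sample document below is a made-up stand-in for a real dump, and the helper names are hypothetical. Watchlists are not covered by this sketch.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny stand-in for an export file; a real dump is far larger and its
# namespace version may differ.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Example</title>
    <revision>
      <contributor><username>Alice</username><id>1</id></contributor>
    </revision>
    <revision>
      <contributor><ip>192.0.2.1</ip></contributor>
    </revision>
    <revision>
      <contributor><username>Alice</username><id>1</id></contributor>
    </revision>
  </page>
</mediawiki>"""


def local(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def edits_per_contributor(stream):
    """Count revisions per username (or IP address for anonymous edits)."""
    counts = Counter()
    for _, elem in ET.iterparse(stream, events=("end",)):
        name = local(elem.tag)
        if name == "contributor":
            # Registered users carry <username>; anonymous edits carry <ip>.
            who = [c.text for c in elem if local(c.tag) in ("username", "ip")]
            if who:
                counts[who[0]] += 1
        elif name == "page":
            elem.clear()  # free memory once a page has been processed
    return counts


print(edits_per_contributor(io.BytesIO(SAMPLE.encode("utf-8"))))
```

Streaming with `iterparse` rather than loading the whole file matters here because full-history exports are very large.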
On 2/28/07, T P t0m0p0@gmail.com wrote:
<snip>
I know that an edit to something on my watchlist draws my attention and, in the course of checking the edit, I may see other things that need fixing. This is a wholly unscientific observation, of course.
I agree that the Watchlist is likely the main vector through which edits become known to others and acted upon. Also don't discount the nudges people leave on Talk pages of others to explicitly notify others of a Wikiproject or set of edits.
I do take some issue with the analysis of the paper. Rather than say oligarchy becomes democracy, I think it's more accurate to say small democracy scaled up to become big democracy, and the power curve followed. There was never a "rule by a few" in the strict sense of an oligarchy. Even right from the start when Wikipedia got Slashdotted in 2001, it was welcoming newcomers to take up whatever role they identified themselves for.
This also shows the complete inadequacy of using "system of government" metaphors for trying to describe the dynamics of a peer-production environment. It's as useful as using kilograms to measure brightness.
-Andrew (User:Fuzheado)
On 2/28/07, Sage Ross sage.ross@yale.edu wrote:
One of the central conclusions is "edits beget edits", though this is in terms of correlation, not causation. But assuming causation, how many edits does an edit beget? That is, as a function of total edits, time since creation, and number of new edits in a recent time slice, how many extra edits are expected in the next time slice and beyond? If I make 50 edits in one day to a 1000-edit article of age 50 weeks, how many extra edits will others contribute the next day, and how many extra over the next year?
There is also no real way to know whether those edits are "valuable" edits. One would hope that bad edits beget better edits, and that vandalism begets correction!
I must say that my primary objection here is the use of the term "democracy" as if it were unproblematic -- everybody has a different idea of what "democracy" means and stands for, and I'm not sure it is at all the right term to describe the functioning of Wikipedia (however rosy it sounds). If Wikipedia is a democracy then the term democracy has taken on a new and bizarre meaning. My wild guess is that the people who gave it that designation have not spent much time contemplating political theory, or even understanding the dynamics of complex organizations. My anecdotal experience would lead me to think that different areas of Wikipedia -- editing high-visibility articles, editing low-visibility articles, arbitrating article content disputes, arbitrating user disputes, article or image reviews, etc. -- function in totally different ways. Since many of these functions are *not* regulated by any strict set of guidelines, I would not be surprised to find all sorts of slightly different ad hoc dynamics involved in each realm, which become somewhat self-reinforcing with respect to the type of people who frequent each. (I would be interested to know if there is much overlap between people who edit heavily, people who write new articles heavily, people who help out at the reference desk, and people who vote for article deletion.) But anyway, I haven't juggled any data, so I admit to just playing the armchair methodologist.
FF
On 3/3/07, Fastfission fastfission@gmail.com wrote:
<snip>
There is also no real way to know whether those edits are "valuable" edits. One would hope that bad edits beget better edits, and that vandalism begets correction!
That's the important question from the responsible editor's perspective, though: to what degree do good edits beget more good edits? Anecdote and personal experience suggest "some", but it could be tested in a very rough way by taking (a large number of) paired articles of about equal standing and working on one but not the other.
I must say that my primary objection here is the use of the term "democracy" as if it were unproblematic -- everybody has a different idea of what "democracy" means and stands for, and I'm not sure it is at all the right term to describe the functioning of Wikipedia (however rosy it sounds). If Wikipedia is a democracy then the term democracy has taken on a new and bizarre meaning. My wild guess is that the people who gave it that designation have not spent much time contemplating political theory, or even understanding the dynamics of complex organizations. My anecdotal experience would lead me to think that different areas of Wikipedia -- editing high-visibility articles, editing low-visibility articles, arbitrating article content disputes, arbitrating user disputes, article or image reviews, etc. -- function in totally different ways. Since many of these functions are *not* regulated by any strict set of guidelines, I would not be surprised to find all sorts of slightly different ad hoc dynamics involved in each realm, which become somewhat self-reinforcing with respect to the type of people who frequent each. (I would be interested to know if there is much overlap between people who edit heavily, people who write new articles heavily, people who help out at the reference desk, and people who vote for article deletion.) But anyway, I haven't juggled any data, so I admit to just playing the armchair methodologist.
Yeah, the newspaper headline version of the results is hardly worth taking seriously. But the numerical results behind that are interesting nonetheless, especially as a complement to Aaron Swartz's "Who Writes Wikipedia" from several months ago. I just wish they had presented the data in terms that are meaningful for individual articles without having to crunch the numbers.
-Sage
On 3/3/07, Sage Ross ragesoss+wikipedia@gmail.com wrote:
On 3/3/07, Fastfission fastfission@gmail.com wrote:
<snip>
There is also no real way to know whether those edits are "valuable" edits. One would hope that bad edits beget better edits, and that vandalism begets correction!
That's the important question from the responsible editor's perspective, though: to what degree do good edits beget more good edits? Anecdote and personal experience suggest "some", but it could be tested in a very rough way by taking (a large number of) paired articles of about equal standing and working on one but not the other.
More anecdotes backed up by gut feelings...
I think the "edits beget edits" experience is real. Of course, there's vandalism and reversion, and the fact that some articles attract a lot of vandalism for a little while, or a lot of vandalism episodically (I'm not talking about [[George W. Bush]] here). But then there are those articles that sit on your watchlist - things you came across, maybe made a few changes, and then watchlisted, meaning to come back later. If other people are like me in this regard (and looking at public "to do lists", I'm guessing they are), Wikipedia is full of low quality articles you meant to get back to once you had the time to look up that one reference...and never did. Or articles that desperately needed work, but you didn't have the time or didn't feel like dealing with it at the time.
Once someone starts working on one of these "dormant" articles, you come over and have a look. That's especially true if they make several edits. And you start to chime in - adding a reference, fixing a typo, correcting a fact. And this might attract another editor. Sometimes it's a bad edit, or vandalism that attracts you. Sometimes it's a biased insertion. More often than not, it's a one-off thing that's soon forgotten (or, worse yet, added to a "to do" list). But sometimes it attracts a swarm of editors.
I'd agree with the assertion that "edits beget edits", but (I think) primarily on articles that have already been edited by several people. Activity that only shows up on "Recent Changes" doesn't work the same way - the people watching that are looking for vandalism. So my hypothesis is that non-vandalism edits beget other edits on pages that have been edited by at least a few (5, to pick an arbitrary number) established editors.
Ian
On 2/28/07, Keith Old keithold@gmail.com wrote:
G'day folks,
Nature has published an article on studies on Wikipedia editing patterns.
<snip>
They have looked at 4.7 million pages from the English-language Wikipedia, subjected to a total of about 58 million revisions, to see who was making the changes, and how.
When I first read this I was very confused since we don't have that many articles. Did they look at non-article pages? Seems unlikely. Then I read the abstract of the study which states:
"Despite the apparent lack of order, the 50 million edits by 4.8 million contributors to the 1.5 million articles in the English-language Wikipedia follow strong certain overall regularities."
It seems like the Nature people just confused the numbers (4.7 and 4.8 are very close after all). Seems like a silly mistake :)
Can I say it?!?! PLEASE!! I want to say it! Ok, I'm saying it!
Had this article been.....
All right, I won't say it :(
--Oskar