Joseph Reagle wrote:
> On Friday 06 November 2009, Michael Ekstrand wrote:
>> While not the primary result, we include charts of the proportion
>> of revisions which are accepted. The metric should be applicable
>> for a more rigorous study of unproductive vs. productive work.
> Since figure 15 seems to have a lower bound at 0.82, would it be safe
> to infer that at least 82% of contributions are productive
> (i.e., accepted by the community under k-acceptance)?
Short answer: probably, but that's based on a number of assumptions that
may or may not hold.
As always, the devil is in the details, but your statement is likely
true, assuming that editors continue to accept edits at roughly the
same rate as they did up to January 2008. We do not yet know how
acceptance behavior changes over time, though, so that assumption may
or may not hold. Also, there are cases which k-acceptance will
incorrectly classify. Some of the false positives are discussed in the
paper. It is also possible for an edit to be reverted and then re-made
later, possibly in altered form, in which case the original edit will
be detected as rejected. Using k-acceptance as a metric for
productivity gets more complicated still, as indirect influence may be
a factor as well (e.g. when a rejected edit inspires a later accepted
edit; there is an example of this in [Krip07]).
The other confounding factor is edits to very-low-traffic articles.
Edits languishing in stubs that no one else edits never have an
opportunity to be accepted under this metric, since acceptance depends
on other editors editing the article without reverting the change in
question. Those edits fall into the "undefined" classification in our
metric; I do not know offhand how many revisions are affected by this,
or what the net impact of this effect is.
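To make the three-way classification concrete, here is a minimal sketch
of the decision rule as I've described it above. The function name, the
revision representation, and the use of k *distinct* later editors are
all illustrative assumptions on my part, not the paper's actual code:

```python
def classify(revision_index, history, k=3):
    """Classify one revision as 'accepted', 'rejected', or 'undefined'.

    Hypothetical model: `history` is a chronological list of dicts with
      'editor'  - who made the edit
      'reverts' - index of the revision this edit reverts, or None
    A revision is accepted once k distinct later editors have edited
    the article without anyone reverting it (an assumed reading of the
    metric); reverted first means rejected; too little subsequent
    activity (e.g. a stub no one touches) means undefined.
    """
    editors_seen = set()
    for rev in history[revision_index + 1:]:
        if rev['reverts'] == revision_index:
            return 'rejected'      # explicitly reverted before acceptance
        editors_seen.add(rev['editor'])
        if len(editors_seen) >= k:
            return 'accepted'      # k later editors let it stand
    return 'undefined'             # not enough later activity to decide
```

The "undefined" branch is exactly the stub problem: the rule can only
fall through to it when the article's later history is too short.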
It should be possible to resolve these issues with a careful analysis
of the edge cases: investigating the impact of stubs and other
short-history articles, and possibly human-coding a sample of other
cases to see how often k-acceptance misses what you mean by
"productivity". So I think this metric can be a starting point for
what you're getting at, but more work is needed to validate the
assumptions before making statements like that with confidence.
Fortunately, most of the adjustments from resolving these assumptions
should increase the percentage of edits classified as productive.
Michael Ekstrand <ekstrand(a)cs.umn.edu>
Ph.D student, Computer Science -- University of Minnesota
GroupLens Research: http://www.grouplens.org
Confused by odd attachments? See http://www.elehack.net/resources/gpg
window manager, n: a program for arranging multiple Emacs frames