I think this conversation isn't going anywhere useful because everyone is using the
same words but with different meanings. In particular "quality" ...
Edits can do a range of things (and often more than 1). The edits might relate to:
* the information content of an article (add/edit/remove propositions -- facts if you
prefer -- about the topic, e.g. Fred Smith was born in 1770)
* the references (add/edit/delete the external sources that support a proposition)
* the presentation of the article, e.g. Structure of the article, spelling, grammar,
appropriately nuanced selection among synonyms, clear prose, conformance to the Manual of
Style, wikifying, etc
Each of these can have some kind of quality metrics attached (although most will be
somewhat subjective -- "an article in the New York Post is a better quality source
than ...").
At the moment many in this conversation are using GA as the only quality metric. But I
think we should see this as a goal not a binary "quality / not quality" metric.
To achieve GA, clearly you need facts, verification, and presentation all in both quantity
and quality.
Next, who is an IP? Well, we know that IPs don't necessarily map to individual people
and individual people do not map to a single IP. An IP edit might be done by someone who
is a registered user (but too lazy to log in -- I'm guilty of that), who may later
become a registered user, or who may never be a registered user.
I postulate that good faith IP edits are predominantly small edits of facts or localized
edits of presentation (e.g. spelling). I postulate edits of logged-in users would be both
large and small and involve facts, references and presentation, although clearly
individual users may have their own particular profiles of edit behavior.
In particular, to get an article to GA you need one person (or just a few people) to
polish the writing (presentation). Getting a super-readable document with many voices is
very difficult. Therefore I would expect that the final push to achieve GA would
inevitably involve registered users and not IPs.
Also GA status is a concept and process that is very much "insider" knowledge
about WP. Anonymous editors and low-activity editors are unlikely to have even heard about
GA status and are therefore not going to be working toward it. Only the very active
insiders would see it as their goal and therefore work towards it. So I think it is
pointless to discuss contribution to quality in terms of who gets an article to GA
status.
I think we do better to ask the question about the quality of an edit (or the set of edits
done by a particular user) in terms of whether it adds "correct" information, adds
references that support information, or improves presentation. If someone adds a
"fact" and that edit is later obliterated by a rewrite of a section but the
information is retained (albeit in a different presentation), the original edit was still
good quality even if it doesn't survive as a string of characters. I think the use of
"edit survival" to measure the quality of an edit fails to distinguish
between information content and presentation. I acknowledge that "edit
survival" is easily measured and "information content survival" is not, but
we should be cautious about using one as a proxy for the other.
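To make the distinction concrete, here is a minimal Python sketch. The function names,
the rewritten sentence and the key-term check are all made up for illustration (they are
not an established metric); the key-term check is only a crude stand-in for a human
judging whether the information is still in the article.

```python
from difflib import SequenceMatcher


def string_survival(added_text, later_revision):
    """Share of the added characters that still line up with the later revision."""
    if not added_text:
        return 0.0
    matcher = SequenceMatcher(None, added_text, later_revision, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(added_text)


def fact_survives(key_terms, later_revision):
    """Hypothetical proxy for a human judgement: are all key terms still present?"""
    text = later_revision.lower()
    return all(term.lower() in text for term in key_terms)


# Illustrative strings only: the original edit and a later rewrite of the section.
added = "Fred Smith was born in 1770."
rewritten = "Born in 1770, Smith grew up on a farm."

print(round(string_survival(added, rewritten), 2))   # below 1.0: the string was rewritten
print(fact_survives(["Smith", "1770"], rewritten))   # True: the birth year is still stated
```

The character-level score drops as soon as the sentence is reworded, while the fact
itself is still there, which is why one measure is a shaky proxy for the other.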
I think a qualitative assessment of a set of randomly selected articles is more likely to
come up with better answers to the question of the contributions of anonymous editors
relative to low-activity registered editors and high-activity registered editors. Such an
assessment would analyse the contribution made by each individual edit in terms of:
* the quality of the article as it was immediately before and after the edit (immediate
contribution)
* the quality of the article as it is today (overall contribution)
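As a rough sketch of how such an assessment could be tabulated, assuming human assessors
supply the scores: the record fields, the 0-10 scale and the `retained_today` weighting
below are my own assumptions, one possible reading of "overall contribution", and the
sample rows are made-up numbers, not data.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class AssessedEdit:
    editor_class: str      # "anonymous", "low-activity" or "high-activity"
    quality_before: float  # assessor's score just before the edit (assumed 0-10 scale)
    quality_after: float   # assessor's score just after the edit
    retained_today: float  # assessor's judgement, 0..1, of how much of the
                           # contribution is still reflected in today's article


def summarise(edits):
    """Average immediate and overall contribution per editor class."""
    groups = defaultdict(list)
    for edit in edits:
        groups[edit.editor_class].append(edit)
    summary = {}
    for editor_class, items in groups.items():
        immediate = [e.quality_after - e.quality_before for e in items]
        overall = [(e.quality_after - e.quality_before) * e.retained_today
                   for e in items]
        summary[editor_class] = {
            "edits": len(items),
            "mean_immediate": mean(immediate),
            "mean_overall": mean(overall),
        }
    return summary


# Made-up sample rows, purely to show the shape of the table.
sample_edits = [
    AssessedEdit("anonymous", 4.0, 5.0, 1.0),
    AssessedEdit("anonymous", 5.0, 6.0, 0.5),
    AssessedEdit("high-activity", 6.0, 8.0, 1.0),
]

for editor_class, row in summarise(sample_edits).items():
    print(editor_class, row)
```

The point of keeping the two columns separate is that an edit can score well on immediate
contribution and still count for little today, or the reverse.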
For the purposes of this conversation, I am ignoring vandalism (and other bad faith
behaviour) and edits to reverse them.
On 01/11/2012, at 10:08 AM, Laura Hale <laura(a)fanhistory.com> wrote:
On Thu, Nov 1, 2012 at 9:14 AM, Piotr Konieczny <piokon(a)post.pl> wrote:
I agree, having a high number of edits does not signify creating high quality content - it
may only attest to the high use of semi-automated tools for minor edits.
I also don't dispute that anons can contribute high quality content, and they do
a lot of edits. My point was:
* anons don't contribute significantly to most content on Wikipedia that gets
peer reviewed (as Pierre noted, by that time they've probably registered anyway);
* hence the majority of Wikipedia's GA+ content is not written by anonymous editors (but
the GA+ content is only a small percentage of Wikipedia's total content);
Do you have any evidence that anons don't contribute significantly to content that
gets peer reviewed? The reason it would appear they are not involved in processes is
that, more often than not, they are expressly prohibited from doing so. The implication
here could be: IP addresses are contributing GA-level content, but regular contributors
are not monitoring articles where IP addresses are doing lots of work and are not
supporting taking the work to the highest level.
http://toolserver.org/~daniel/WikiSense/Contributors.php?wikilang=en&wi…
is one of the more active articles (which is admittedly crap) with a high IP address
ratio. There are several highly active Wikipedia editors contributing to it. 463 of the
749 editors are IP addresses. Still, total edits by registered editors outnumber those by
unregistered editors, 1,175 to 1,150. Despite this, the volume of contributors is not
actually resulting in edits that work towards improving the article's assessment.
A better analysis could be something like this: IP addresses are more likely to represent
a large share of the editing population on an article that has higher visibility and more
traffic. The quality of the contributions to these articles is universally poor for both
registered and unregistered users. At the same time, Wikipedia processes favour articles
that have less visibility and where there is less inherent conflict. The necessity of
covering a topic comprehensively also serves as a barrier to taking these higher
visibility articles to GA: it is a challenge, and it discourages editors from taking an
article through the processes. GA, Peer Review and FAC favour narrower topics that are
less visible and get less traffic. This type of article is likely to have a much smaller
editing pool and is less likely to be found by IP address editors. (Example: Tennis
articles have more IP address edits than articles about sport shooting.) This means IP
addresses are less likely to be actively contributing to these articles. As processes
implicitly lock them out, there is little reason for these users to improve these less
visible articles per the guidelines.
Sincerely,
Laura Hale
--
twitter: purplepopple
blog:
ozziesport.com
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l