On Tue, Mar 25, 2014 at 7:41 PM, Philippe Beaudette
<philippe(a)wikimedia.org> wrote:
I wouldn't know, Pete. But as I recall, it was a manual process, wasn't
it? And therefore quite difficult to scale and/or adapt for some usages?
On Tue, Mar 25, 2014 at 4:35 PM, Pete Forsyth <peteforsyth(a)gmail.com> wrote:
> Philippe,
>
> The Public Policy Initiative produced strong validation for the Wikipedia
> 1.0 approach to assessing article quality. Was Amy Roth's research ever
> published, and are there any plans to repeat it with a larger sample size
> etc.? I'd say we're closer than you think to having a good way to measure
> article quality.
>
That part of Amy Roth's research has not been published except
on-wiki.[1] There was also a followup study after the Public Policy
Initiative using the same method, which found similar
results.[2]
It's true that these studies were manual processes that took a huge
amount of work and wouldn't scale (at least, not without some investment
in tools for easy data generation and collection). But I think Pete's
point is that these studies show that the widespread Wikipedia 1.0
scale itself does a decent job of what it is intended to do.
[1] https://outreach.wikimedia.org/wiki/Student_Contributions_to_Wikipedia
[2] https://en.wikipedia.org/wiki/Wikipedia:Ambassadors/Research/Article_qualit…
On Tue, Mar 25, 2014 at 7:55 PM, Marc A. Pelletier <marc(a)uberbox.org> wrote:
On 03/25/2014 07:45 PM, John Mark Vandenberg wrote:
If nothing else, the existing community quality rating system (i.e. FA, GA,
etc.) should be used.
That idea needs to be tempered with a strong caveat: at least for
enwiki, those processes tend to be highly politicized as they are already.
Focusing strategy on those is likely to have volatile effects and any
step in that direction has to be done deliberately and with a great deal
of caution.
While this is true to some degree, if an article *does* make it
through one of these processes, that's a fairly reliable indicator
that it's at least pretty decent. The bigger problem is that the lower
ratings, between Stub and B-class, are often badly out-of-date. For
both the process-based ratings (FA, A, GA) and the informally assigned
ones (Stub, Start, C, B), the ratings are probably best thought of as
a very approximate lower bound for quality. (That is, a "B-class"
article might really be GA quality and it just never went through the
process, or a "Stub" might actually have developed well into C-class
territory in the time since its rating was assigned.)
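As a rough illustration of the lower-bound reading (a sketch only: the
function names are made up for this example, and the ordering
Stub < Start < C < B < GA < A < FA is assumed to be the usual enwiki
assessment order, not something taken from the studies above):

```python
# Wikipedia 1.0 classes, lowest to highest (assumed enwiki ordering).
SCALE = ["Stub", "Start", "C", "B", "GA", "A", "FA"]
RANK = {name: i for i, name in enumerate(SCALE)}


def possible_classes(recorded):
    """Classes an article may really belong to, if its recorded rating
    is treated only as a lower bound on quality."""
    return SCALE[RANK[recorded]:]


def at_least(recorded, threshold):
    """True when the recorded rating by itself guarantees that the
    article meets the threshold class."""
    return RANK[recorded] >= RANK[threshold]


# A "B-class" article could be anything from B up to FA.
print(possible_classes("B"))
# A GA rating guarantees at least B; a Stub rating guarantees nothing about C.
print(at_least("GA", "B"), at_least("Stub", "C"))
```

Read this way, a recorded rating can confirm that an article clears a
bar, but a low rating cannot show that it fails one.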
The nice thing about the Wikipedia 1.0 ratings is that we already have
all the data, just waiting for someone to do something cool with it.
The history of this page would be fun to play with:
https://en.wikipedia.org/wiki/User:WP_1.0_bot/Tables/OverallArticles
-Sage