On Tue, Mar 25, 2014 at 7:41 PM, Philippe Beaudette philippe@wikimedia.org wrote:
I wouldn't know, Pete. But as I recall, it was a manual process, wasn't it? And therefore quite difficult to scale and/or adapt for some usages?
On Tue, Mar 25, 2014 at 4:35 PM, Pete Forsyth peteforsyth@gmail.com wrote:
Philippe,
The Public Policy Initiative produced strong validation for the Wikipedia 1.0 approach to assessing article quality. Was Amy Roth's research ever published, and are there any plans to repeat it with a larger sample size etc.? I'd say we're closer than you think to having a good way to measure article quality.
That part of Amy Roth's research has not been published except on-wiki.[1] There was also a followup study after the Public Policy Initiative using the same method, which found similar results.[2]
It's true that these studies were manual processes that took a huge amount of work and wouldn't scale (at least, not without some investment in tools for easy data generation and collection). But I think Pete's point is that these studies show that the widely used Wikipedia 1.0 scale itself does a decent job of what it is intended to do.
[1] = https://outreach.wikimedia.org/wiki/Student_Contributions_to_Wikipedia
[2] = https://en.wikipedia.org/wiki/Wikipedia:Ambassadors/Research/Article_quality...
On Tue, Mar 25, 2014 at 7:55 PM, Marc A. Pelletier marc@uberbox.org wrote:
On 03/25/2014 07:45 PM, John Mark Vandenberg wrote:
If nothing else, the existing community quality rating system (i.e. FA, GA, etc.) should be used.
That idea needs to be tempered with a strong caveat: at least for enwiki, those processes tend to be highly politicized as they are already. Focusing strategy on those is likely to have volatile effects and any step in that direction has to be done deliberately and with a great deal of caution.
While this is true to some degree, if an article *does* make it through one of these processes, that's a fairly reliable indicator that it's at least pretty decent. The bigger problem is that the lower ratings, between Stub and B-class, are often badly out of date. For both the process-based ratings (FA, A, GA) and the informally assigned ones (Stub, Start, C, B), the ratings are probably best thought of as a very approximate lower bound for quality. (That is, a "B-class" article might really be GA quality and it just never went through the process, or a "Stub" might actually have developed well into C-class territory in the time since its rating was assigned.)
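To make the "lower bound" reading concrete, here is a minimal sketch in Python. The class ordering follows the enwiki Wikipedia 1.0 scale; the function names and the numeric ranks are purely illustrative assumptions, not part of WP 1.0 bot or any existing tool.

```python
# Wikipedia 1.0 classes, lowest to highest, as ordered on the enwiki scale.
CLASSES = ["Stub", "Start", "C", "B", "GA", "A", "FA"]

# Hypothetical numeric rank for each class (illustrative only).
RANK = {name: i for i, name in enumerate(CLASSES)}


def possible_classes(recorded):
    """Classes the article might really belong to, if the recorded
    rating is read as a lower bound rather than an exact value."""
    return CLASSES[RANK[recorded]:]


def at_least(recorded, threshold):
    """True if the recorded rating alone already meets the threshold."""
    return RANK[recorded] >= RANK[threshold]
```

Under this reading, `possible_classes("B")` leaves open B, GA, A and FA, while `at_least("Stub", "C")` is False even though the article may have grown into C-class territory since it was rated.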
The nice thing about the Wikipedia 1.0 ratings is that we already have all the data, just waiting for someone to do something cool with it. The history of this page would be fun to play with: https://en.wikipedia.org/wiki/User:WP_1.0_bot/Tables/OverallArticles
-Sage