On Wed, Mar 26, 2014 at 7:13 AM, Sage Ross <ragesoss+wikipedia@gmail.com> wrote:
That part of Amy Roth's research has not been published except on-wiki.[1] There was also a follow-up study after the Public Policy Initiative using the same method, which found similar results.[2]
Thanks Sage, [1] is the page I was looking for, and had trouble finding.
It's true that these studies were manual processes that took a huge amount of work and wouldn't scale (at least, not without some investment in tools for easy data generation and collection).
Yes -- the *research that produced* the finding was resource-intensive. But as Sage says, my point is that it's the *results* of the research that have lasting value.
But I think Pete's point is that these studies show that the widespread Wikipedia 1.0 scale itself does a decent job of what it is intended to do.
[1] = https://outreach.wikimedia.org/wiki/Student_Contributions_to_Wikipedia
[2] = https://en.wikipedia.org/wiki/Wikipedia:Ambassadors/Research/Article_quality...
On Tue, Mar 25, 2014 at 7:55 PM, Marc A. Pelletier <marc@uberbox.org> wrote:
On 03/25/2014 07:45 PM, John Mark Vandenberg wrote:
If nothing else, the existing community quality rating system (i.e. FA, GA, etc.) should be used.
Exactly. And the question of *exactly how* reliable that system is, is an important one, and one we are at least part way to answering.
That idea needs to be tempered with a strong caveat: at least for enwiki, those processes tend to be highly politicized as it is. Focusing strategy on them is likely to have volatile effects, and any step in that direction has to be taken deliberately and with a great deal of caution.
Very good point. (But not so much caution that we fail to proceed at all!!!)
While this is true to some degree, if an article *does* make it through one of these processes, that's a fairly reliable indicator that it's at least pretty decent.
Yes. The value of Amy's research is that we can say that with confidence -- not just as Wikipedians with anecdotal experience.
It is true that her study covered only the US Public Policy topic, and I don't think it was ever published in a peer-reviewed journal. But it does, if nothing else, lay out a clear path for future research. Amy also published her methodology.[3]
The bigger problem is that the lower ratings, between Stub and B-class, are often badly out-of-date.
<snip>
That is true, but it is also something that can be systematically evaluated by software. For instance, an automated process could "score" the currency of a rating by determining when the "class=" parameter was last updated, and summarizing what has happened to the article since (bytes added, number of references added, sub-articles spun off, number of talk page comments...). This would of course require further deliberation and design -- but it's a rich vein to mine.
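To make that concrete, here's a rough sketch of what one piece of such a scoring pass could look like -- Python against the live MediaWiki API. The activity_since helper, the example title, and the hard-coded timestamp are all mine, and actually locating when the "class=" parameter last changed (which would mean diffing talk-page revisions) is left out:

    # Rough sketch: summarize article activity since a rating was last
    # changed, via the public MediaWiki API. Assumes you already know
    # the timestamp at which the talk page's "class=" parameter was
    # last edited; finding that timestamp is not shown. No continuation
    # handling, so very active articles may be undercounted past the
    # API's per-request revision limit.

    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def activity_since(title, rating_timestamp):
        """Count revisions and net bytes added to `title` after
        `rating_timestamp` (ISO 8601, e.g. "2013-01-01T00:00:00Z")."""
        params = {
            "action": "query",
            "prop": "revisions",
            "titles": title,
            "rvprop": "timestamp|size",
            "rvstart": rating_timestamp,  # oldest revision to include
            "rvdir": "newer",             # enumerate forward in time
            "rvlimit": "max",
            "format": "json",
        }
        data = requests.get(API, params=params).json()
        page = next(iter(data["query"]["pages"].values()))
        revs = page.get("revisions", [])
        if not revs:
            return {"revisions": 0, "bytes_added": 0}
        # Net size change between the oldest and newest revision in the
        # window; a single revision yields 0 since the prior size is
        # not fetched here.
        return {
            "revisions": len(revs),
            "bytes_added": revs[-1]["size"] - revs[0]["size"],
        }

    print(activity_since("Public policy", "2013-01-01T00:00:00Z"))

Nothing authoritative -- just enough to show that the raw ingredients (revision timestamps and sizes) are already exposed by the API.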
The nice thing about the Wikipedia 1.0 ratings is that we already have all the data, just waiting for someone to do something cool with it. The history of this page would be fun to play with: https://en.wikipedia.org/wiki/User:WP_1.0_bot/Tables/OverallArticles
-Sage
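Pulling those snapshots is straightforward, for what it's worth. A rough sketch in the same vein -- the page name is real, the revision count is arbitrary, and turning the wikitext table into per-class counts is left as an exercise:

    # Rough sketch: fetch a few dated snapshots of the WP 1.0 bot's
    # summary table so the rating distribution can be tracked over
    # time. Only fetches raw wikitext; parsing it is not shown.

    import requests

    API = "https://en.wikipedia.org/w/api.php"
    PAGE = "User:WP 1.0 bot/Tables/OverallArticles"

    params = {
        "action": "query",
        "prop": "revisions",
        "titles": PAGE,
        "rvprop": "timestamp|content",
        "rvlimit": "5",   # a handful of recent snapshots; raise as needed
        "format": "json",
    }
    data = requests.get(API, params=params).json()
    page = next(iter(data["query"]["pages"].values()))
    for rev in page.get("revisions", []):
        wikitext = rev["*"]  # content key in the default JSON format
        print(rev["timestamp"], len(wikitext), "bytes of wikitext")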
OK, a small pet peeve about that table: the "importance" axis is meaningless when aggregated. If an Academy Award-winning film was set partially in Wisconsin, it might be rated "top" importance by WikiProject Film, but "low" importance by WikiProject Wisconsin. "Importance" in the abstract is a meaningless term; it's "importance in XYZ context" that matters. </rant>
-Pete
[3] https://outreach.wikimedia.org/wiki/Public_Policy_Initiative_Evaluation_and_...