I must say I'm pretty dubious about this approach for articles. I doubt it can detect most of the typical problems with them - for example, all-online sources are very often a warning sign, but not always, and may be inevitable for a topical subject. Most of Charles' factors below relate more to views and controversy than to article quality, and article quality has only a limited ability to increase views, as a study of FAs before and after expansion will show.
Personally, I'd think the lack of one to, say, four dominant editors in the stats is a more reliable warning sign in most subjects than "A single editor, or essentially only one editor with tweaking". Nothing is more characteristic of a popular but poor article than a huge list of contributors, all with fewer than 10 edits - a check that could easily be automated, as sketched below. Sadly, the implied notion (at T for Traffic below) that fairly high views automatically lead to increased quality is very dubious - we have plenty of extremely bad articles that have by now had millions of viewers, who between them have done next to nothing to improve the now-ancient text.
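To make that concrete, here is a minimal sketch of the "dominant editors" check in Python. The helper name, thresholds, and input format are all my own assumptions for illustration, not anything from the existing stats tools:

    from collections import Counter

    def lacks_dominant_editors(edits_by_editor, top_n=4, share_threshold=0.5):
        # edits_by_editor: mapping of editor name -> number of edits,
        # e.g. assembled from the article's revision history.
        # Returns True (a warning sign) when the top_n editors together
        # account for less than share_threshold of all edits, i.e. a
        # long tail of drive-by contributors with nobody shepherding
        # the article.
        counts = Counter(edits_by_editor)
        total = sum(counts.values())
        if total == 0:
            return True  # no edit history at all is its own warning
        top_share = sum(n for _, n in counts.most_common(top_n)) / total
        return top_share < share_threshold

The 50% cut-off is of course arbitrary; the point is only that concentration of edits among a few editors is measurable, where "essentially only one editor" is not.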
There is also the question of what use the results of the exercise will be. Our current quality ratings certainly have problems, but they are a lot better than nothing. However, the areas where systematic work seems to be going on to improve the lowest-rated articles, in combination with high importance ratings, are relatively few. An automated system is hard to argue with, and I'm concerned that such ratings will actually cause more problems than they reveal or solve, if people take them more seriously than they deserve, or are unable to override or question them. One issue with the manual system is that it tends to give greatly excessive weight to article length, as though there were a standard ideal size for all subjects, which of course there isn't. It will be even harder for an automated system to avoid the same pitfall without relying on the very blunt instrument of our importance ratings, which don't pretend to operate to common standards: nobody thinks that "high-importance" means, or should mean, the same thing between, say, WikiProject_Friesland (https://en.wikipedia.org/wiki/WikiProject_Friesland) and Wikiproject:Science.
John
Date: Wed, 16 Apr 2014 19:53:20 +0100
From: Charles Matthews <charles.r.matthews@ntlworld.com>
There's the old DREWS acronym from How Wikipedia Works, to which I'd now add T for traffic. In other words, there are six factors that an experienced human would use to analyse quality, looking in particular for warning signs.
D = Discussion: crunch the talk page (20 archives = controversial, while no comments indicates possible neglect).
R = WikiProject rating, FWIW, if there is one.
E = Edit history. A single editor, or essentially only one editor with tweaking, is a warning sign. (Though not if it is me, obviously.)
W = Writing. This would take some sort of text analysis. Work to do here. Includes detection of non-standard format, which would suggest neglect by experienced editors.
S = Sources. Count footnotes and so on.
T = Traffic. Pages at 100 hits per month are not getting many eyeballs. Warning sign. Very high traffic is another issue.
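Purely as a sketch of how the six checks might be wired together - every field name and threshold here is an assumption for illustration, not a settled value:

    def drews_t_warnings(article):
        # article: dict of pre-computed signals for one page, e.g.
        # {"talk_archives": 0, "project_rating": None, "editor_count": 1,
        #  "dominant_share": 0.95, "nonstandard_format": False,
        #  "footnote_count": 2, "monthly_views": 80}
        warnings = []
        # D = Discussion: many archives = controversial; none = neglect
        if article["talk_archives"] >= 20:
            warnings.append("controversial (20+ talk archives)")
        elif article["talk_archives"] == 0:
            warnings.append("possible neglect (no talk page activity)")
        # R = Rating: unassessed by any WikiProject
        if article["project_rating"] is None:
            warnings.append("no WikiProject rating")
        # E = Edit history: essentially a single editor
        if article["editor_count"] == 1 or article["dominant_share"] > 0.9:
            warnings.append("single-editor article")
        # W = Writing: non-standard format suggests neglect
        if article["nonstandard_format"]:
            warnings.append("non-standard format")
        # S = Sources: few or no footnotes
        if article["footnote_count"] < 3:
            warnings.append("few footnotes")
        # T = Traffic: very few eyeballs
        if article["monthly_views"] <= 100:
            warnings.append("low traffic (~100 hits/month)")
        return warnings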
Seems to me that there is enough to bite on here.
Charles