Apologies for being somewhat late to the party; our upcoming CSCW 2015 paper (coming soon to a research outlet near you!) has been taking up my attention.  That's kind of ironic, since in that paper our primary method of assessing quality is a machine learner (we also use human assessments to confirm our results).

Earlier in the discussion, Aaron pointed to our WikiSym '13 paper[1].  Two aspects of article quality that have been brought up in this discussion were also on our minds when doing that work.  First, readability: Stvilia et al.[2] used Flesch-Kincaid[3] as part of one of their metrics.  In my work I've found that it's not a particularly useful feature; it doesn't really help discern the quality of an article.
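
For those unfamiliar with it, the Flesch-Kincaid grade level is just a linear combination of average sentence length and average syllables per word.  Here's a crude sketch in Python; the syllable counter is a naive vowel-group heuristic, so treat its output as approximate:

    import re

    def count_syllables(word):
        # Naive heuristic: count runs of consecutive vowels.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def flesch_kincaid_grade(text):
        # FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count_syllables(w) for w in words)
        return (0.39 * (len(words) / len(sentences))
                + 11.8 * (syllables / len(words))
                - 15.59)

    print(round(flesch_kincaid_grade(
        "The cat sat on the mat.  It was a sunny day."), 2))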

Secondly, information about editors, e.g. edit counts, tenure, etc.  These features will typically help; for instance, having a diverse set of editors working on an article is associated with higher quality.  But, as we argue in our 2013 paper, that is not a feature that is easy to change, nor something that is easy to help someone change.  The same goes for a few other features from the literature, e.g. number of edits or mean edits per day ("you should stop using the preview button and save all changes, even the small ones, because that'll increase the quality of the article").  Instead we argue for using features that editors can act upon, and then feeding those back into SuggestBot's set of article suggestions to assist editors in finding articles that they want to contribute to.
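
To make the distinction concrete, here's a toy sketch in Python of what a classifier built on actionable features might look like.  The features and numbers below are invented for illustration and are not the exact feature set from our paper:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # One row per article; all editor-controllable signals.
    feature_names = ["references", "headings", "images", "wikilinks"]
    X = np.array([
        [45, 12, 6, 150],   # a well-developed article
        [ 2,  1, 0,  10],   # a stub
        [30,  8, 3,  90],
        [ 0,  0, 0,   5],
    ])
    y = np.array([1, 0, 1, 0])  # 1 = high quality, 0 = low quality

    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X, y)

    # Feature importances suggest concrete advice ("add references"),
    # unlike editor-centric features ("recruit more diverse editors").
    for name, importance in zip(feature_names, model.feature_importances_):
        print(f"{name}: {importance:.2f}")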

Lastly, I'd like to mention that determining whether an article is high quality or not is a reasonably simple task, as it's a binary classification problem.  This is where features like word count or article length have been shown to work well.  Nowadays I find the problem of assessing quality on a finer-grained scale (e.g. English Wikipedia's 7-class assessment scale[4]) to be more interesting.
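
As a toy illustration of why the binary case is easy (all numbers invented), a single word-count threshold can separate high from low quality here, but obviously can't place articles on a 7-class scale, where adjacent classes overlap in length:

    import numpy as np

    word_counts = np.array([120, 300, 800, 2500, 4000, 6500])
    is_high_quality = np.array([0, 0, 0, 1, 1, 1])

    threshold = 2000  # hypothetical cutoff
    predictions = (word_counts > threshold).astype(int)
    accuracy = (predictions == is_high_quality).mean()
    print(f"binary accuracy with one threshold: {accuracy:.2f}")  # 1.00 on this toy data

    # A 7-class scale (Stub, Start, C, B, GA, A, FA) needs more than one
    # threshold on one feature; adjacent classes differ in subtler ways.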

But, as James touched on earlier, "quality" is a many-faceted subject.  While computational approaches work well for measures like amount of content, use of images, or citations, determining whether the sources used are appropriate is a much harder task.

Footnotes:
1: Warncke-Wang, M., Cosley, D., & Riedl, J. (2013, August). Tell me more: an actionable quality model for Wikipedia. In Proceedings of the 9th International Symposium on Open Collaboration (p. 8). ACM.  http://opensym.org/wsos2013/proceedings/p0202-warncke.pdf
2: Stvilia, B., Twidale, M. B., Smith, L. C., & Gasser, L. (2005). Assessing information quality of a community-based encyclopedia. In Proceedings of the International Conference on Information Quality (ICIQ 2005).  http://mitiq.mit.edu/ICIQ/Documents/IQ%20Conference%202005/Papers/AssessingIQofaCommunity-basedEncy.pdf
3: https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests
4: With the exception of A-class articles: they're practically nonexistent, and since they are by definition "complete", just like Featured Articles, they shouldn't remain A-class for long.


Regards,
Morten


