I've just noticed the GNU diction package, which applies a number of "figure of merit" measures to text. Although the figures it generates are of debatable usefulness, it might be an interesting experiment to run it on a Wikipedia dump.
Some questions:
- Do the various readability scores rise as articles are revised?
- Would it be worth creating special pages to find the best/worst articles by reading score, so they can be copyedited, or checked to see if they are suitable for being featured articles?
-- Neil
On 07/16/04 16:45, Neil Harris wrote:
I've just noticed the GNU diction package, which applies a number of "figure of merit" measures to text. Although the figures it generates are of debatable usefulness, it might be an interesting experiment to run it on a Wikipedia dump.
Some questions:
- Do the various readability scores rise as articles are revised?
- Would it be worth creating special pages to find the best/worst articles by reading score, so they can be copyedited, or checked to see if they are suitable for being featured articles?
I don't know, but if there's a Web gateway where you can measure any text you paste into a text box, there's a few fans of grammatical overcomplexity on Wikipedia it would be useful for beating over the head ...
- d.
On Fri, 16 Jul 2004 19:25:05 +0000, David Gerard fun@thingy.apana.org.au wrote:
I don't know, but if there's a Web gateway where you can measure any text you paste into a text box, there's a few fans of grammatical overcomplexity on Wikipedia it would be useful for beating over the head ...
There are a few of these around. The one I use is at http://resources.aellalei.com/writer/sample.html and provides a Flesch Reading Ease, Fog Scale Level and Flesch-Kincaid Grade Level for any text pasted in. It's only suitable for individual articles though. It wouldn't be the best option for a large-scale analysis.
Angela.
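
For anyone who wants to try two of the scores mentioned above on more than one article at a time, here is a rough Python sketch using the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The function names are made up for illustration, and the syllable counter is a crude vowel-group heuristic (an assumption, not what GNU diction or the web tool above actually does), so the numbers will only approximate theirs.

```python
import re


def count_syllables(word):
    # Crude heuristic (an assumption): one syllable per run of vowels,
    # with a minimum of one syllable per word.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability(text):
    # Split into sentences on runs of ., ! or ? and into words on letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        # Higher means easier to read.
        "flesch_reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        # Approximate US school grade level; higher means harder.
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word
        - 15.59,
    }
```

Calling `readability(article_text)` on each article in a dump would give a score per article that could then be sorted, though the markup would presumably need stripping first.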
wikipedia-l@lists.wikimedia.org