On Sun, Jul 29, 2012 at 10:17 PM, Tilman Bayer <tbayer@wikimedia.org> wrote:
Of course, here the term "high quality" does not necessarily mean, say, featured content (e.g. on the English Wikipedia, featured articles currently make up less than 0.1% of the total articles), but instead refers to comparisons with average contributions.
Someone from the Education Program will be able to give a more thorough overview of the efforts to evaluate its results, but for example I'm aware of
https://blog.wikimedia.org/2012/04/19/wikipedia-education-program-stats-fall... . The quantitative method used there has its limitations, but similar methods are employed in independent (i.e. non-WMF) research about Wikipedia in the academic literature.
It certainly does have limitations. Let's look at what it says:
---o0o---
In the Wikipedia Education Program, professors assign their students to edit Wikipedia articles as a grade for class, assisted by volunteer Wikipedia Ambassadors. In fall 2011, 55 courses participated in the program in the United States, with students editing articles on the English Wikipedia. On average, these students added 1855 bytes of content that stayed on Wikipedia, compared to only 491 for a randomly chosen sample of new users who joined English Wikipedia in September 2011. These numbers establish that students who participate in the Wikipedia Education Program contribute significantly more quality content that stays on Wikipedia than other new users.
---o0o---
Apart from John's very salient question about how the random sample of editors was selected, another very obvious issue is the traffic the edited pages attract. A random sample of users might include contributors to very popular and heavily edited pages, while students' edits are more likely to be to specialised pages on scholarly niche topics that get very few views and attract few edits.
Content on little-watched pages always stays longer than content on highly watched pages with a high edit turnover. This is quite irrespective of edit quality. Just look at some Wikipedia pages on Indian villages ... their content is crap, with outstanding long-term stability. :)
So until the analysis also factors in page viewing statistics and average edits per month on each page, the variables are hopelessly confounded, and the conclusions are nothing but wishful thinking (not to say lying with statistics).
In other words, it's impossible to conclude that content staying on Wikipedia is a reflection of edit quality, rather than a reflection of said content being on a very obscure page that no one reads or edits.
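To make the confounding concrete, here is a rough sketch in Python. Everything in it is hypothetical: the function name, the byte count, the per-edit overwrite probability and the edit counts are made-up illustration values, not figures from the blog post or from any Wikipedia data. It assumes two contributions of identical size and identical quality, where each later edit to the page removes the contribution with the same fixed probability, and varies only how many later edits the page receives.

```python
def expected_surviving_bytes(bytes_added, later_edits, overwrite_prob):
    """Expected bytes still on the page after `later_edits` subsequent edits.

    Toy assumption: each later edit independently removes the whole
    contribution with probability `overwrite_prob`, regardless of the
    contribution's quality.
    """
    return bytes_added * (1.0 - overwrite_prob) ** later_edits


# Hypothetical illustration values (not measured data).
BYTES_ADDED = 2000      # same contribution size in both cases
OVERWRITE_PROB = 0.05   # same per-edit risk, i.e. same "quality"

# Only the page's edit turnover differs between the two cases.
niche_page = expected_surviving_bytes(BYTES_ADDED, later_edits=2,
                                      overwrite_prob=OVERWRITE_PROB)
busy_page = expected_surviving_bytes(BYTES_ADDED, later_edits=30,
                                     overwrite_prob=OVERWRITE_PROB)

print(f"niche page: {niche_page:.0f} bytes expected to survive")
print(f"busy page:  {busy_page:.0f} bytes expected to survive")
```

Under these assumptions the contribution to the rarely edited page retains far more bytes, although nothing about its quality differs. A comparison of surviving bytes between two groups of editors would have to control for page turnover before it could say anything about quality.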
If the Foundation has an interest in producing meaningful statistical analyses, I would suggest actually employing a statistician who can give such posts a look-over and point out the obvious fallacies.
http://meta.wikimedia.org/wiki/Research_talk:Wikipedia_Education_Program_eva...