On 10/10/07, charles.r.matthews@ntlworld.com wrote:
Actually, a 5% sample is large enough if the numbers involved are large enough.
More importantly, it is large enough only if the sample is diverse enough.
On 10/9/07, Robert Rohde rarohde@gmail.com wrote:
Given the lack of any recent official stats, I set out to generate my own using a dump of the Wikipedia log files and by systematically downloading (over many days) the history page contents for 100,000 articles.
I assume these were selected more or less at random?
—C.W.
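
For anyone who wants to check the sample-size point numerically, here is a rough sketch of the usual margin-of-error calculation for an estimated proportion. It assumes the 100,000 articles were a simple random sample, which is exactly what the question above is asking about. The function name `margin_of_error` is made up for illustration, and the 2,000,000 population figure is only what "100,000 is 5%" would imply, not a number taken from the stats themselves.

```python
import math

def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from n samples.

    p=0.5 is the worst case; z=1.96 corresponds to roughly 95% confidence.
    Assumes simple random sampling (hypothetical helper, not from the thread).
    """
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: sampling a sizeable fraction of
        # the whole population shrinks the error slightly.
        se *= math.sqrt((population - n) / (population - 1))
    return z * se

if __name__ == "__main__":
    # Assumed population: 100,000 articles being a 5% sample implies 2,000,000.
    assumed_population = 2_000_000
    for n in (100, 1_000, 100_000):
        moe = margin_of_error(n, population=assumed_population)
        print(f"n={n:>7}: +/- {100 * moe:.2f} percentage points")
```

The error depends mostly on the absolute sample size rather than on the 5% fraction, which is the sense in which the sample is "large enough". None of this helps if the selection was not random: a biased sample stays biased no matter how large it is.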