The editing sample is a selection of articles (namespace=0,
page_is_redirect is false) drawn with a random number generator from the
September 8th dump of the page table. So it excludes articles created in
the last weeks of September, but is otherwise a random sample of
everything in article space.
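
For anyone who wants to see the shape of that selection, here is a minimal sketch, not the actual script used. It assumes the page table rows are already loaded as dictionaries keyed by the MediaWiki column names page_namespace and page_is_redirect; the function name sample_articles, the seed, and the toy rows are made up for illustration.

```python
import random


def sample_articles(pages, k, seed=None):
    """Draw k rows uniformly at random, without replacement, from the
    non-redirect pages in the article namespace (namespace 0)."""
    eligible = [
        p for p in pages
        if p["page_namespace"] == 0 and not p["page_is_redirect"]
    ]
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(eligible, k)


# Toy stand-in for the page table (hypothetical data, not from the dump).
pages = [
    {"page_id": i, "page_namespace": i % 3, "page_is_redirect": i % 5 == 0}
    for i in range(1, 301)
]

sample = sample_articles(pages, 10, seed=42)
```

Because every eligible row has the same chance of being picked, the only pages left out systematically are those missing from the dump itself.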
-Robert
On 10/10/07, Kwan Ting Chan <ktc(a)ktchan.info> wrote:
On Wed, 2007-10-10 at 08:43 -0700, Steven Walling wrote:
You're trying to make an accurate judgment based on 100k of articles
from a 2 million article field? Don't insult our intelligence.
Exactly what is new with taking a sample of the whole to come up with
trends in statistics?
Any concern should be whether 5% of 2 million is a large enough sample
(I would say so), and whether the sample is representative of the whole
sample space (all the articles). The second I can't comment on, as I do
not know what those 100K articles are, among other reasons.
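
To put a rough number on the first point, here is a back-of-the-envelope sketch. It assumes a simple random sample and that the statistic of interest is a proportion, neither of which has been confirmed in this thread; the function name margin_of_error and the worst-case p = 0.5 are my own choices for illustration.

```python
import math


def margin_of_error(n, N, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion estimated from a
    simple random sample of n items out of a population of N, using the
    normal approximation with the finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / n)
    finite_population_correction = math.sqrt((N - n) / (N - 1))
    return z * standard_error * finite_population_correction


# Figures from this thread: 100k sampled out of roughly 2 million articles.
moe = margin_of_error(100_000, 2_000_000)
print(f"worst-case margin of error: {moe * 100:.2f} percentage points")
```

Under those assumptions the worst-case margin comes out at about 0.3 percentage points. That speaks only to sample size; it says nothing about whether the sample is representative.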
KTC
--
Experience is a good school but the fees are high.
- Heinrich Heine
_______________________________________________
WikiEN-l mailing list
WikiEN-l(a)lists.wikimedia.org
To unsubscribe from this mailing list, visit:
http://lists.wikimedia.org/mailman/listinfo/wikien-l