[WikiEN-l] The Statistical Decline of the English Wikipedia Community
Andrew Gray
shimgray at gmail.com
Wed Oct 10 19:36:34 UTC 2007
On 10/10/2007, Anthony <wikimail at inbox.org> wrote:
> And since you're not including redirects, there's also a (potentially
> large) bias against articles which were heavily edited in the past and
> then later turned into redirects.
Yeah, this is an interesting one.
Firstly, we're ignoring a population of pages which have edit history
but no current content - much the same issue as with deleted pages.
Secondly, though, we have a problem that doesn't occur with deleted
pages: merges. When an article is redirected, it's often because it's
been merged into another page; the text is copied over and a note left
in the history. This is fair enough for our purposes, but for
automated analysis like this it causes a glitch; the multiple edits
over a long period which created that text aren't considered, and we
end up perceiving the content as created at another page, at a much
later date, in a single edit, by (probably) an unrelated user.
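To make the glitch concrete, here is a rough sketch with entirely made-up
data. The page names, users, and character counts are hypothetical, and
the Edit record and naive_attribution function are illustrative only, not
the method used in the analysis under discussion. It just shows how
skipping redirects shifts credit for merged text onto whoever performed
the merge.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    page: str
    user: str
    year: int
    chars_added: int


def naive_attribution(edits, skip_pages):
    """Credit each edit's added text to its user, ignoring skipped pages."""
    credit = {}
    for e in edits:
        if e.page in skip_pages:
            continue
        credit[e.user] = credit.get(e.user, 0) + e.chars_added
    return credit


# Hypothetical history: "Old topic" was written by two users over two
# years, then merged into "Main topic" by a third user in a single edit.
edits = [
    Edit("Old topic", "alice", 2003, 1200),
    Edit("Old topic", "bob", 2004, 800),
    Edit("Main topic", "carol", 2005, 500),
    Edit("Main topic", "dave", 2007, 2000),  # the merge: copies 2000 chars
]
redirects = {"Old topic"}  # now a redirect, so excluded from the analysis

# What the analysis sees when redirects are skipped.
naive = naive_attribution(edits, skip_pages=redirects)

# What actually happened, leaving out the copy so text is not counted twice.
actual = naive_attribution(edits[:3], skip_pages=set())

print(naive)
print(actual)
```

In the naive view, the 2000 characters appear in 2007 under the merging
user, and the original authors and their 2003-2004 edits drop out of the
count altogether.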
So how many merges are there? Difficult question. Juggling some
numbers, I'd guess about 0.5-1% of our total pages have at some point
been merged...
--
- Andrew Gray
andrew.gray at dunelm.org.uk