[WikiEN-l] The Statistical Decline of the English Wikipedia Community

Andrew Gray shimgray at gmail.com
Wed Oct 10 19:36:34 UTC 2007


On 10/10/2007, Anthony <wikimail at inbox.org> wrote:
> And since you're not including redirects, there's also a (potentially
> large) bias against articles which were heavily edited in the past and
> then later turned into redirects.

Yeah, this is an interesting one.

Firstly, we're ignoring a population of pages which have edit history
but no current content - much the same issue as with deleted pages.

Secondly, though, we have a problem that doesn't occur with deleted
pages: merges. When an article is redirected, it's often because it's
been merged into another page; the text is copied over and a note left
in the history. This is fair enough for our purposes, but for
automated analysis like this it causes a glitch: the multiple edits
over a long period which created that text aren't considered, and we
end up perceiving the content as created at another page, at a much
later date, in a single edit, by (probably) an unrelated user.
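
To make the glitch concrete, here is a rough sketch on made-up data.
The page names, users, dates and both function names are invented for
illustration; this is not the script used for the original analysis.
It also assumes every redirect is a merge, which is certainly not true
in practice.

```python
from datetime import date

# Hypothetical toy edit log; none of these pages or users are real.
edits = [
    {"page": "Old topic", "user": "alice", "day": date(2003, 5, 1)},
    {"page": "Old topic", "user": "bob", "day": date(2004, 2, 10)},
    {"page": "Old topic", "user": "alice", "day": date(2005, 8, 3)},
    # The merge: carol pastes the accumulated text into "Main topic".
    {"page": "Main topic", "user": "carol", "day": date(2007, 1, 15)},
]

# "Old topic" has since been turned into a redirect to "Main topic".
redirects = {"Old topic": "Main topic"}


def naive_summary(edits, redirects):
    """Summarise edits per page, skipping pages that are now redirects."""
    summary = {}
    for e in edits:
        if e["page"] in redirects:
            continue  # the redirect's history is dropped entirely
        s = summary.setdefault(
            e["page"], {"edits": 0, "users": set(), "first": e["day"]}
        )
        s["edits"] += 1
        s["users"].add(e["user"])
        s["first"] = min(s["first"], e["day"])
    return summary


def merge_aware_summary(edits, redirects):
    """Same summary, but credit a redirect's history to its target.

    Assumption: every redirect is a merge, and redirect chains are not
    followed. Both simplifications would need checking on real data.
    """
    remapped = [dict(e, page=redirects.get(e["page"], e["page"])) for e in edits]
    return naive_summary(remapped, {})


print(naive_summary(edits, redirects))
print(merge_aware_summary(edits, redirects))
```

The naive pass sees "Main topic" as one edit by carol in 2007; the
merge-aware pass sees four edits by three users going back to 2003.
The second view will overcount wherever a redirect was not really a
merge, so it is an upper bound rather than a fix.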

So how many merges are there? Difficult question. Juggling some
numbers, I'd guess about 0.5-1% of our total pages have at some point
been merged...

-- 
- Andrew Gray
  andrew.gray at dunelm.org.uk


