[WikiEN-l] Rating the English wikipedia

Ian Woollard ian.woollard at gmail.com
Tue Feb 15 04:00:05 UTC 2011


On 14/02/2011, David Gerard <dgerard at gmail.com> wrote:
> On 14 February 2011 20:48, Gwern Branwen <gwern0 at gmail.com> wrote:
>> Perhaps
>> http://en.wikipedia.org/wiki/User:Piotrus/Wikipedia_interwiki_and_specialized_knowledge_test

Oh riiiiiiiiiiiiiiiiiiiight. So back in 2006, Piotrus claims that
there should be 400 million articles.

It turns out he based this essentially only on biographies. In Poland.

Quick sanity check: that's about one bio article for every twenty
people alive on the entire planet. And these would be encyclopedically
*notable* people, would they?
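
A minimal sketch of that sanity check in Python. It assumes a world
population of about 7 billion (the figure used further down in this
post); the variable names are illustrative only.

```python
# Rough ratio of living people to claimed articles.
claimed_articles = 400_000_000      # the 400 million claim
world_population = 7_000_000_000    # assumption: "about 7 billion"

people_per_article = world_population / claimed_articles
print(f"one article per {people_per_article:.1f} people alive")
```

That comes out a little under one in twenty, so "every twentieth
person" is, if anything, slightly generous to the claim.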

We can easily see that that's not going to happen, even allowing for
the fact that lots of people have died already, most people just
aren't that notable, and the current population completely swamps
historical populations.

OK, so how did this happen? I checked back through the history of
the article. The first claim was that it essentially needs 400 million
biographies of people. It turns out that the 400 million came from
taking 30 in 1000 to be 0.3% and then dividing the number of
biographies in the English Wikipedia by that. But... 30 in 1000 is 3%.
So he's already out by a factor of 10. That's bad enough. So now we're
down to 40 million.
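
The percentage slip can be reproduced with exact fractions. This is
only a sketch of the arithmetic as described above, not a
reconstruction of the original page's calculation; the names are
hypothetical.

```python
from fractions import Fraction

# 30 people in 1000, kept exact to avoid float rounding.
correct_share = Fraction(30, 1000)   # 3%
mistaken_share = Fraction(3, 1000)   # 0.3%, the value apparently used

# Dividing a count by a share that is 10x too small inflates it 10x.
inflation = correct_share / mistaken_share

claimed_total = 400_000_000
corrected_total = claimed_total / inflation

print(f"inflation factor: {inflation}")
print(f"corrected total: {int(corrected_total):,}")
print(f"multiplier at 3%: about {float(1 / correct_share):.0f}x")
```

Dividing by 3% is the same as multiplying by roughly 33, which is the
scale-up factor that remains once the slip is fixed.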

His next error is assuming that the English Wikipedia is off by a
factor of 33 on its biographies *worldwide*, as opposed to having a
blind spot on Poland.

So let's look at this. The biographical encyclopedia that he mentions
has 25,000 entries. Poland has 38 million people. So less than 1
person in a thousand is notable in Poland according to this
encyclopedia.

I then checked the British biographical reference 'Who's Who'. It has
about 30,000 entries, but that's only about 1 person in 2000 in Great
Britain, so even fewer.

But again, roughly 1 person in 1000.
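
The two rates can be compared side by side. A rough sketch using only
the figures quoted above; the British population is not stated here,
so the code only derives what the "1 in 2000" rate implies.

```python
# Poland: entries in the biographical encyclopedia vs. population.
polish_entries = 25_000
poland_population = 38_000_000
people_per_entry_poland = poland_population / polish_entries

# Great Britain: Who's Who entries and the rate quoted above.
whos_who_entries = 30_000
people_per_entry_gb = 2000  # as stated, not independently checked
implied_gb_population = whos_who_entries * people_per_entry_gb

print(f"Poland: 1 in {people_per_entry_poland:.0f}")
print(f"Great Britain: 1 in {people_per_entry_gb} "
      f"(implies about {implied_gb_population:,} people)")
```

Both rates are below 1 in 1000, so treating 1 in 1000 as the working
figure errs on the generous side.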

The world population is currently about 7 billion.

So if it's as high as 1 in 1000 then that's about 7 million
articles, and to be honest in reality it's probably a *lot* less: a
lot of people globally do things like subsistence-level farming, and
are thus far less likely to be notable. So even that is excessively
favourable.
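
As a sketch, the upper bound and how far the two earlier figures
overshoot it. It assumes the 1-in-1000 rate holds worldwide, which the
paragraph above already argues is too generous.

```python
world_population = 7_000_000_000
people_per_notable = 1000  # generous worldwide rate

upper_bound = world_population // people_per_notable
print(f"upper bound: {upper_bound:,} biographies")

# Compare the original claim and the corrected claim with that bound.
for label, figure in [("400 million", 400_000_000),
                      ("40 million", 40_000_000)]:
    ratio = figure / upper_bound
    print(f"{label} is {ratio:.1f}x the upper bound")
```

Even against this generous bound the 400 million figure is more than
50 times too large, and the gap widens if the real number is only a
few million.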

I would guess we're looking at a few million biographies needed
worldwide, at the very most. And sure, there are probably other
biographical encyclopedias out there, and they may list a few more
people that Who's Who misses, but whether those would survive AfDs in
a general encyclopedia depends on notability.

Anyway, I'll stop there. Even 40 million appears completely
unsupportable. It looks like it's off by about another order of
magnitude.

So, to sum up, this article's claim of 400 million is just based on
simple and obvious arithmetic and logical errors, and seems to be two
orders of magnitude too high.

> - d.

-- 
-Ian Woollard


