[WikiEN-l] The percentage of English Wikipedia articles about living people over time.

Gregory Maxwell gmaxwell at gmail.com
Tue Oct 16 22:54:17 UTC 2007


On 10/16/07, phoebe ayers <phoebe.wiki at gmail.com> wrote:
> So to make sure I understand properly, this is saying that currently,
> roughly 11% of all articles in the English Wikipedia are biographies of
> living people?

Yes: 11% of non-redirect article pages are tagged with Category:Living people.

Since not all non-redirect pages are articles (90k disambigs, etc.), but
all the Living people pages should be articles, the actual concentration
is somewhat greater than 11%.
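
To make the size of that adjustment concrete, here is a rough sketch of
the arithmetic. The 11% and the 90k disambiguation pages come from the
text above; the total of 2,000,000 non-redirect pages is a made-up round
number used only for illustration, and the function name is hypothetical.

```python
def article_share(tagged_fraction, non_redirect_pages, non_article_pages):
    """Share of real articles that carry the tag.

    tagged_fraction is measured against all non-redirect pages, so the
    denominator is shrunk by the pages that are not articles.
    """
    tagged = tagged_fraction * non_redirect_pages
    articles = non_redirect_pages - non_article_pages
    return tagged / articles


# ASSUMPTION: 2,000,000 non-redirect pages is a placeholder, not a measured count.
share = article_share(0.11, 2_000_000, 90_000)
print(f"{share:.2%}")
```

With those placeholder inputs the share moves up by about half a
percentage point, and any further non-article pages hidden in the "etc"
would push it higher still.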

> If so, yikes.
>
> I'm curious how much of the article count biographies in general (of either
> living or dead people) make up as well. I guess you'd have to follow the
> category tree of the (much less well known) [[Category:Dead_people]] to
> figure that out and add the two up ... the upper-level people categories
> seem a bit disorganized.

I'm curious about this too, but ideas that include the words "follow
the category tree" are generally complete non-starters if you care
about remotely sane results.

I really wanted to break all of Wikipedia down into a dozen or so top
level categories so I can make a stacked line graph showing the
composition over time... but I've found no way of breaking up the
articles using automated category analysis that doesn't produce
utterly rubbish results.
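
As a toy illustration of how this kind of analysis can go wrong, the
sketch below walks a tiny, invented category graph. Every category and
page name in it is hypothetical, and the failure it shows (subcategory
links that mean "related to" rather than "is a") is only one plausible
cause, assumed here for the sake of the example.

```python
from collections import deque

# Toy category graph, invented for illustration: parent -> subcategories.
SUBCATS = {
    "People": ["Musicians", "Fictional characters"],
    "Musicians": ["Albums by artist"],
    "Fictional characters": [],
    "Albums by artist": [],
}

# Toy membership, also invented: category -> pages filed directly in it.
PAGES = {
    "People": [],
    "Musicians": ["Some Singer"],
    "Fictional characters": ["Some Cartoon Hero"],
    "Albums by artist": ["Some Album"],
}


def pages_under(root):
    """Collect every page reachable by following subcategories from root."""
    seen = {root}
    queue = deque([root])
    found = set()
    while queue:
        cat = queue.popleft()
        found.update(PAGES.get(cat, []))
        for sub in SUBCATS.get(cat, []):
            if sub not in seen:  # guard against cycles in the category graph
                seen.add(sub)
                queue.append(sub)
    return found


result = pages_under("People")
print(sorted(result))
```

In this toy graph, starting from "People" also sweeps in an album and a
cartoon character, so a page count taken from the walk would overstate
the number of biographies.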

I haven't looked specifically at doing that to identify dead people
articles... and I will... but I do not have high hopes. My past
experience suggests that the results will be nearly useless.

> Re: Rambot -- it's a self-fulfilling prophecy :) if you have articles
> about places, then clearly you need articles about people who live in
> those places... right?

I'm sure that this is a sub-subject worthy of a research paper on its
own. Some kind of spontaneous symmetry breaking? "What you lack is
what you get" becomes "What you're getting you get more of" which
becomes "What you've got you get more of"? ;)


