Steve Bennett wrote:
On Sat, Dec 12, 2009 at 8:28 AM, Lars Aronsson lars@aronsson.se wrote:
The various (humorous) Size-of-Wikipedia pictures come to mind,
http://commons.wikimedia.org/wiki/File:Size_of_English_Wikipedia_broken_down... http://commons.wikimedia.org/wiki/File:Size_of_Wikipedia_broken_down.sv
Boring I am, but I would like to see updated versions of the non-satirical picture. A year-on-year diagram would be interesting.
Exactly that thought made me introduce the categories for men and women in the Swedish Wikipedia in August 2008. It took 18 weeks to cover all 80,000 biographies. We found that biographies make up 28% of all articles and that there are four men for every woman. (The German Wikipedia already had such categories.) We also continued to add categories for year of birth and year of death to those biographies that lacked them, using the men/women categories to identify which articles are biographies.
This year, another user introduced a category:Living people (for men/women that lack a category for year of death), and after six weeks we know that 42,000, or 45%, of the now 93,600 biographies describe living people.
We now have enough data to draw a [[population pyramid]] for biographies in the Swedish Wikipedia. One such diagram is http://commons.wikimedia.org/wiki/File:LA2-gender-age.png
Note: This is very different from some language versions of Wikipedia that have these categories (men, women, living people) but without the *ambition to cover all biographic articles*. Without that ambition, you can't use the categories for pie charts or this kind of bookshelf diagram.
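As a rough illustration of the tallying behind such a population pyramid, here is a minimal Python sketch. It is not the method used for the diagram above; the function name `pyramid_bins`, the tuple layout (gender, birth year, death year), the ten-year bins, the reference year and the sample rows are all assumptions made up for this example. It counts only people without a year of death, grouped by age bin and gender.

```python
from collections import Counter

def pyramid_bins(biographies, ref_year=2009, bin_width=10):
    """Count living people per (age-bin start, gender).

    `biographies` is an iterable of (gender, birth_year, death_year)
    tuples; this layout is hypothetical, chosen only for the sketch.
    """
    counts = Counter()
    for gender, birth_year, death_year in biographies:
        # Skip people with a death year, and people whose birth year
        # is unknown (they cannot be placed in an age bin).
        if death_year is not None or birth_year is None:
            continue
        age = ref_year - birth_year
        bin_start = (age // bin_width) * bin_width
        counts[(bin_start, gender)] += 1
    return counts

# Made-up sample rows, not real data.
sample = [
    ("man", 1950, None),
    ("woman", 1955, None),
    ("man", 1900, 1980),    # has a death year, skipped
    ("woman", 1982, None),
    ("man", None, None),    # birth year unknown, skipped
]
bins = pyramid_bins(sample)
```

The counts are only meaningful if the gender and year categories cover every biography, which is exactly the ambition described in the note above.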
For other kinds of articles (non-biographies), such divisions are much harder to establish. The category system doesn't easily divide articles into such groups. Even within biographies, there is no really systematic division by profession that would let you tell scientists apart from politicians, or left-wing from right-wing politicians. You can do this only for very limited groups of articles, such as biographies of U.S. senators from the Democratic and Republican parties. For all articles, you can make an estimate by sampling random articles, but exactly which labels should your pie chart show?
If the section "History of Aberdeen" is made into an article of its own, does that mean you have one more article on geography (a city in Scotland) or one more article on history?
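For the sampling estimate mentioned above, a minimal Python sketch might look like this. It assumes you have already drawn n random articles and labelled each one by hand; the function name `proportion_estimate` and the figures 28 out of 100 are hypothetical, chosen only to illustrate the arithmetic. It uses the ordinary normal approximation for a 95% confidence interval, which is one common choice and not the only one.

```python
import math

def proportion_estimate(hits, n, z=1.96):
    """Estimate a share from a random sample.

    Returns (p, lower, upper), where lower/upper bound a normal
    approximation confidence interval (z=1.96 gives roughly 95%).
    """
    p = hits / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Hypothetical sample: 28 of 100 randomly drawn articles got one label.
p, lo, hi = proportion_estimate(28, 100)
print(f"estimated share {p:.0%}, roughly {lo:.0%} to {hi:.0%}")
```

The interval says nothing about whether the labels themselves are well defined; a sample cannot settle the "History of Aberdeen" question, it can only count articles once you have decided.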