Peter Jacobi wrote:
Do you see any chance of getting a similar graph for the percentage of articles regarding fictional (persons, places, spaceships, ... everything)?
I've done some work on this recently, resulting in .
That image is in percentages of text volume (in bytes), but I also have the percentages by number of articles. Unfortunately, there is no time series. For people, the numbers are close to those of Gregory: 10.8% living, 8.9% dead.
I've identified 7.2% of articles as a location; this is probably an underestimate. 4.2% is disambiguation; 3.4% albums and singles; 3.0% tree-of-life articles; 1.6% movies. Over 60% unclassified stuff. Suggestions for more categories *and how to recognize them* are welcome.
Technical details: these numbers are the percentages of non-redirect articles in the main namespace that match one of the following regexes:
- /\[\[[Cc]ategory:[Ll]iving people(\||\]\])/
- /\[\[[Cc]ategory:[^]]+ (births|deaths)(\||\]\])/
- /{{\s*[Cc]oor/  <-- This one is of very dubious quality
- /{{[dD]isamb/
- /\[\[[Cc]ategory:\d+ (albums|singles)(\||\]\])/
- /{{\s*[Tt]axobox\b/
- /\[\[[Cc]ategory:[^]]+ films(\||\]\])/
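For anyone who wants to try this on their own dump, here is a rough Python sketch of how the patterns above could be applied. It is not the script I actually used. The labels, the function names `classify` and `percentages`, and the `#REDIRECT` check for skipping redirects are assumptions made for illustration, and the wiki-link brackets and braces are escaped explicitly for Python's `re` module.

```python
import re
from collections import Counter

# The patterns listed above, keyed by labels invented for this sketch.
PATTERNS = {
    "living": re.compile(r"\[\[[Cc]ategory:[Ll]iving people(\||\]\])"),
    "births/deaths": re.compile(r"\[\[[Cc]ategory:[^]]+ (births|deaths)(\||\]\])"),
    "location": re.compile(r"\{\{\s*[Cc]oor"),  # dubious quality, see above
    "disambiguation": re.compile(r"\{\{[dD]isamb"),
    "albums/singles": re.compile(r"\[\[[Cc]ategory:\d+ (albums|singles)(\||\]\])"),
    "tree of life": re.compile(r"\{\{\s*[Tt]axobox\b"),
    "films": re.compile(r"\[\[[Cc]ategory:[^]]+ films(\||\]\])"),
}

# Assumption: a redirect page starts with "#REDIRECT" (any letter case).
REDIRECT = re.compile(r"\s*#REDIRECT", re.IGNORECASE)


def classify(wikitext):
    """Return the set of labels whose pattern occurs in the wikitext."""
    return {label for label, pat in PATTERNS.items() if pat.search(wikitext)}


def percentages(articles):
    """Percentage of non-redirect articles matching each pattern.

    `articles` is an iterable of wikitext strings from the main namespace.
    An article can count towards more than one label.
    """
    pages = [text for text in articles if not REDIRECT.match(text)]
    if not pages:
        return {label: 0.0 for label in PATTERNS}
    counts = Counter(label for text in pages for label in classify(text))
    return {label: 100.0 * counts[label] / len(pages) for label in PATTERNS}
```

Because an article can match several patterns (a person has both a living-people and a births category), the percentages need not add up to 100, and the unclassified share is simply the articles for which `classify` returns an empty set.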