Categories are added by hand. At most, one could write a bot that looks
for infobox or introduction text containing a date of birth/death and
automatically adds the category if it doesn't exist. As a rule, though, it
seems that if someone has died then a date of death is usually there, and
usually so are the categories you'd need.
The easiest and most exact way would be a database query, which could look
for *"born * died * 1941"* or just *"died * 1941"* in the first paragraph,
and also check that at least one word like *wrote / author / poet /
painter* or *{{infobox person}}* appears in the text, or that *"novelists |
writers | painters | authors..."* appears in at least one category. That
should do exactly what you need, but you'll need to find someone to set up
and run the query for you.
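To make the idea concrete, here is a rough Python sketch of the filter such
a query would apply, run over article text you already have in hand (for
example from a database dump). The function names, the 40-character gap
standing in for the "*" wildcard, and the assumption that the first
blank-line-separated block is the introduction are all mine, for
illustration only; a real query would need tuning.

```python
import re

# Word lists taken from the examples in this mail; extend as needed.
CREATOR_WORDS = re.compile(r"\b(wrote|author|poet|painter)\b", re.IGNORECASE)
CREATOR_CATEGORIES = re.compile(
    r"\b(novelists|writers|painters|authors)\b", re.IGNORECASE
)


def died_in_year(first_paragraph, year):
    """True if 'died' is followed by the year within the same sentence.

    Assumption: the '*' wildcard is approximated by up to 40 characters
    that are not a full stop.
    """
    pattern = re.compile(rf"\bdied\b[^.]{{0,40}}?\b{year}\b", re.IGNORECASE)
    return bool(pattern.search(first_paragraph))


def is_candidate(text, categories, year=1941):
    """Apply the filter: death year in the intro AND a creator hint."""
    # Assumption: the introduction is the first block before a blank line.
    first_paragraph = text.strip().split("\n\n", 1)[0]
    if not died_in_year(first_paragraph, year):
        return False
    in_text = bool(CREATOR_WORDS.search(text))
    has_infobox = "{{infobox person" in text.lower()
    in_categories = any(CREATOR_CATEGORIES.search(c) for c in categories)
    return in_text or has_infobox or in_categories
```

You would call is_candidate(article_text, list_of_category_names) for each
article and keep the ones that return True.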
If not, then these other options might help somewhat...
(1) Biographies will often start like this: *NAME (born 18 May 1862, died
17 June 1941, Sweden) was a.....*
So you could search for articles with the words *died 1941* in them.
The trouble is that there are many reasons an article could contain those
words. Limiting the search to biographical articles might help. Some search
engines allow you to search for pages where specific words appear close
together, but Wikipedia's search doesn't have that feature, or not yet.
Even so, this search does turn up useful results, especially combined with
the *incategory:* operator. You can also narrow down by adding words that
articles about copyright creators are likely to contain, such as "author",
"playwright", "poet", "artist" etc. Try these searches:
died 1941 <http://en.wikipedia.org/w/index.php?title=Special%3ASearch&sear…>
born died 1941 <http://en.wikipedia.org/w/index.php?title=Special%3ASearch&sear…>
(biographies with "died" will usually also have "born", use this to narrow
down)
died 1941 author <http://en.wikipedia.org/w/index.php?title=Special%3ASearch&se…>
(not so helpful)
born died 1941 wrote <http://en.wikipedia.org/w/index.php?title=Special%3ASearch&sea…>
(adding one "copyright-creator" word seems to work, just. Adding more seems
to confuse things)
died 1941 incategory:"Polish writers" <http://en.wikipedia.org/w/index.php?title=Special%3ASearch…>
(but doesn't pick up articles nested in subcategories)
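If you want to try many variations of these searches, a small Python helper
can build the links for you. This is only a sketch: the helper name is
made up, and it assumes Special:Search takes the query in a "search"
parameter, which is how MediaWiki search links normally look. It does not
try to reconstruct the shortened links above.

```python
from urllib.parse import urlencode

BASE = "http://en.wikipedia.org/w/index.php"


def search_url(terms, category=None):
    """Build a Special:Search link for the given words.

    Assumption: the query goes in the 'search' parameter.
    `category`, if given, is added with the incategory: operator.
    """
    query = " ".join(terms)
    if category:
        query += f' incategory:"{category}"'
    return BASE + "?" + urlencode({"title": "Special:Search", "search": query})


# Example: the last search in the list above.
print(search_url(["died", "1941"], category="Polish writers"))
```

Looping over a list of category names with the same terms gives one link
per category, which partly works around the subcategory limitation if you
list the subcategories yourself.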
(2) Google has proximity word searching and can be told to list content
from just one site. All Wikipedia articles are indexed by Google. But it's
very limited in what it will show you, and it can't check the other things
needed to narrow the results down. Try this in Google search:
born * died * 1941 site:en.wikipedia.org <https://www.google.com/search?num=100&hl=en&am…>
(3) A third option, which will pick up the names of articles (but no
further details), is this category search tool:
http://toolserver.org/~magnus/catscan_rewrite.php which lets you enter a
category and search several layers deep, so nested categories will show
up. Try entering "Novelists" under "categories" and "3" under "depth".
There may be other ways, such as common terms that only appear in
biographies. Perhaps someone else will have ideas.
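Coming back to option (3) for a moment: the "depth" setting can be pictured
as a breadth-first walk down the category tree that stops after a fixed
number of levels. The Python sketch below shows that idea on a made-up,
in-memory tree. The category and article names are placeholders, and I am
assuming depth 0 means "the starting category only"; the real tool may
count differently.

```python
from collections import deque

# Hypothetical toy tree: category -> (subcategories, articles).
TREE = {
    "Novelists": (["Novelists by nationality"], ["Article A"]),
    "Novelists by nationality": (["Polish novelists"], []),
    "Polish novelists": (["Polish women novelists"], ["Article B"]),
    "Polish women novelists": ([], ["Article C"]),
}


def articles_within_depth(tree, root, depth):
    """List articles in `root` and in subcategories up to `depth` levels down."""
    seen = {root}            # categories already queued, guards against loops
    found = []               # articles in the order they are discovered
    queue = deque([(root, 0)])
    while queue:
        category, level = queue.popleft()
        subcategories, articles = tree.get(category, ([], []))
        for article in articles:
            if article not in found:
                found.append(article)
        if level < depth:    # only descend while under the depth limit
            for sub in subcategories:
                if sub not in seen:
                    seen.add(sub)
                    queue.append((sub, level + 1))
    return found
```

With this toy tree, a depth of 2 reaches "Polish novelists" but not the
level below it, which is why a small depth can silently miss deeply nested
articles.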
FT2
On Fri, Dec 23, 2011 at 5:15 PM, Alek Tarkowski <
atarkowski(a)centrumcyfrowe.pl> wrote:
Jeremie, FT2,
thank you very much for your advice.
Do you have any idea how complete these lists are? Are they done by
hand, or is there a bot compiling these lists? And in any case, is there
any way to estimate how completely they cover a given category?