[Reposting without thread history, to get past mailing list filter]
From: Erik Zachte [mailto:ezachte@wikimedia.org]
Sent: Friday, October 04, 2013 5:04 PM
To: 'A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics.'
Cc: 'Jake Orlowitz'; 'Anthony Cole'; 'James Heilman'; 'Wiki Medicine discussion'; 'Matthew Roth'
Subject: RE: [Analytics] need traffic data for health content...
Hi Lane,
Did you see these reports?
Here is a category tree below category 'Health' on English Wikipedia (with
some out-of-context sub branches blacklisted).
http://stats.wikimedia.org/wikimedia/pageviews/categorized/wp-en/2013-07/categories_wp-en_cat_Health_2013-07.html
Here are the page views for articles in all those categories:
Warning: the list is overly complete by design.
Some top-ranking titles in this list may seem out of place.
Please note that any Wikipedia article can have tens of categories assigned
to it.
A popular article will rank high in any list where it's featured, regardless
of the category under review.
Thus a well-known singer may be top ranking in a list about politicians,
because he/she also played a minor or brief role in politics.
Iterative pruning of the category tree will yield better results. For now
you have to do the final filtering yourself.
http://stats.wikimedia.org/wikimedia/pageviews/categorized/wp-en/2013-07/pageviews_wp-en_cat_Health_2013-07.html
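To illustrate why pruning matters, here is a rough Python sketch on toy data. It is not the script behind the reports above. The function names, the category names and the view counts are all made up for illustration. It walks the tree below a root category, skips blacklisted branches, and ranks the collected articles by page views.

```python
from collections import deque


def collect_articles(root, subcats, members, blacklist=frozenset()):
    """Breadth-first walk below `root`, skipping blacklisted subcategories."""
    seen = {root}
    queue = deque([root])
    articles = set()
    while queue:
        cat = queue.popleft()
        articles.update(members.get(cat, ()))
        for child in subcats.get(cat, ()):
            if child not in seen and child not in blacklist:
                seen.add(child)
                queue.append(child)
    return articles


def rank_by_views(articles, views):
    """Sort titles by descending view count, ties broken alphabetically."""
    return sorted(articles, key=lambda title: (-views.get(title, 0), title))


# Toy data (hypothetical): one out-of-context branch pulls in a popular singer.
subcats = {"Health": ["Diseases", "Health activists"]}
members = {
    "Health": ["Hygiene"],
    "Diseases": ["Influenza"],
    "Health activists": ["Famous Singer"],
}
views = {"Hygiene": 500, "Influenza": 9000, "Famous Singer": 50000}

unpruned = rank_by_views(collect_articles("Health", subcats, members), views)
pruned = rank_by_views(
    collect_articles("Health", subcats, members, blacklist={"Health activists"}),
    views,
)
print(unpruned)
print(pruned)
```

Without the blacklist the singer tops the list. With the branch pruned, only the health articles remain. Repeating this (inspect the top of the list, blacklist the offending branch, rerun) is what iterative pruning amounts to.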
New insight:
Instead of using the category hierarchy, article lists from WikiProjects
would yield cleaner results, and would suffice for many purposes, notably
yours :-)
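A minimal sketch of that idea, again in Python with invented data: given a WikiProject's article list (assumed here to be available as plain titles) and a table of page views, ranking needs no category tree at all. The function name and all titles and numbers below are hypothetical.

```python
def top_project_articles(project_titles, views, limit=10):
    """Rank a WikiProject's articles by page views, highest first."""
    unique_titles = set(project_titles)  # a title may be listed more than once
    ranked = sorted(unique_titles, key=lambda title: (-views.get(title, 0), title))
    return [(title, views.get(title, 0)) for title in ranked[:limit]]


# Hypothetical project list and monthly view counts.
medicine_titles = ["Influenza", "Hygiene", "Asthma", "Influenza"]
monthly_views = {
    "Influenza": 9000,
    "Hygiene": 500,
    "Asthma": 4000,
    "Famous Singer": 50000,  # not in the project list, so it never appears
}

top_medicine = top_project_articles(medicine_titles, monthly_views, limit=2)
print(top_medicine)
```

Because the project list is curated by hand, popular but off-topic articles never enter the ranking in the first place.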
Cheers,
Erik