* MZMcBride wrote:
I've been asked a few times recently about doing reports of the most-viewed
pages per month/per day/per year/etc. A few years after Domas first started
publishing this information in raw form, the current situation seems rather
bleak. Henrik has a visualization tool with a very simple JSON API behind it
(<http://stats.grok.se>), but other than that, I don't know of any efforts
to put this data into a database.
When making http://katograph.appspot.com/, which renders the German
Wikipedia category system as an interactive "treemap" based on information
such as the number of articles in each category and requests during a 3 day
period, I found that the proxy logs used for stats.grok.se are rather
unreliable, with many of the "top" pages being implausible (articles on
not very notable subjects that have existed only for a very short time
show up in the top ten, for instance). On http://stats.grok.se/en/top you
can see this as well: 40 million views for
`Special:Export/Robert L. Bradley, Jr` is rather implausible, as far as
human users are concerned.
Is it worth a Toolserver user's time to try to create a database of
per-project, per-page page view statistics? Is it worth a grant from the
Wikimedia Foundation to have someone work on this? Is it worth trying to
convince Wikimedia Deutschland to assign resources? And, of course, it
wouldn't be a bad idea if Domas' first-pass implementation was improved on
Wikimedia's side, regardless.
The data that powers stats.grok.se is available for download; it should
be rather trivial to feed it into Toolserver databases and query it as
desired, ignoring performance problems. But short of believing that in
December 2010 "User Datagram Protocol" was more interesting to people
than Julian Assange, you would need some other data source to make good
statistics.
http://stats.grok.se/de/201009/Ngai.cc would be another example.
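To illustrate the "feed it into a database and query it" step, here is a
minimal sketch in Python. It assumes each line of a downloaded dump has four
space-separated fields (project code, page title, request count, bytes
transferred); that layout, the `views` table, and the function names are
assumptions made for this example, and SQLite merely stands in for a
Toolserver database.

```python
import sqlite3


def parse_line(line):
    """Parse one 'project title count bytes' record; return None if malformed."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4:
        return None
    project, title, count, size = parts
    try:
        return project, title, int(count), int(size)
    except ValueError:
        return None


def load(lines, conn):
    """Insert all well-formed records into a (hypothetical) 'views' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS views "
        "(project TEXT, title TEXT, count INTEGER)"
    )
    parsed = (parse_line(line) for line in lines)
    rows = [rec[:3] for rec in parsed if rec is not None]
    conn.executemany("INSERT INTO views VALUES (?, ?, ?)", rows)


def top_pages(conn, project, limit=10):
    """Return (title, total requests) for one project, most requested first."""
    cur = conn.execute(
        "SELECT title, SUM(count) AS total FROM views "
        "WHERE project = ? GROUP BY title "
        "ORDER BY total DESC LIMIT ?",
        (project, limit),
    )
    return cur.fetchall()
```

Something like `load(open(path, encoding="utf-8"), conn)` followed by
`top_pages(conn, "de")` would then give a per-project top list. Note that this
only aggregates what the logs contain, so implausible entries like the ones
above would still show up at the top.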
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/