One week ago, Domas Mituzas started to publish hourly logfiles that summarize the page views for each page of Wikipedia in all languages. I've downloaded these files and tried to dig out some interesting information. As user:LA2, I also tried to spark a discussion on the "village pump" of some languages about how these statistics could be made useful. See for example [[cs:Wikipedie:Pod lípou]], [[de:Wikipedia:Café]], [[nl:Wikipedia:De_kroeg]], [[ru:Википедия:Форум/Технический]], [[sv:Wikipedia:Bybrunnen]].
The only clear answer for now is that a simple weekly summary of the very most popular (a few hundred or so) pages could serve as a to-do list for focusing improvement work. This is similar to the existing "WikiCharts top 100" on the Toolserver, http://tools.wikimedia.de/~leon/stats/wikicharts/
The new logfiles allow a much more accurate, up-to-date and deeper analysis. Is anybody developing a Toolserver service for analyzing and presenting such statistics?
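A minimal sketch of how such a weekly summary could be built, assuming the hourly logfiles have already been parsed into one mapping of page title to view count per hour. The parsing step, the function name `top_pages`, and the default of 300 pages are illustrative assumptions, not part of the published logfiles or any existing Toolserver tool:

```python
from collections import Counter

def top_pages(hourly_counts, n=300):
    """Sum per-hour view counts and return the n most viewed pages.

    hourly_counts: iterable of dicts mapping page title -> views in that hour
    (assumed to be parsed from the hourly logfiles beforehand).
    """
    totals = Counter()
    for hour in hourly_counts:
        totals.update(hour)  # adds this hour's counts to the running totals
    return totals.most_common(n)

# Example with two made-up hours of data
hours = [{"A": 5, "B": 2}, {"A": 1, "C": 10}]
weekly_top = top_pages(hours, n=2)
```

Feeding in the 168 hourly mappings for one week would give the "few hundred or so" list described above.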
From "long tail" theory I borrowed the idea of using the logarithm of the rank as the metric, and this seems to be very useful. The actual page view count can vary, but many articles keep a stable ranking over many days. When a page moves from rank 4711 to rank 471, that is a factor of 10 (logarithm +1.0), just as much of an achievement as moving from rank 471 to rank 47.
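The log-rank idea can be sketched as follows. This is only an illustration under stated assumptions: the function names `log_ranks` and `log_rank_gain` are hypothetical, base 10 is chosen to match the "factor of 10" example, and ties are broken alphabetically by title, which the original analysis does not specify:

```python
import math

def log_ranks(view_counts):
    """Map each page to log10 of its rank, where rank 1 is the most viewed."""
    # Sort by descending count; ties broken by title (an assumption).
    ordered = sorted(view_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {page: math.log10(rank)
            for rank, (page, _) in enumerate(ordered, start=1)}

def log_rank_gain(old_rank, new_rank):
    """Improvement in log10 rank; positive when the page moved up."""
    return math.log10(old_rank) - math.log10(new_rank)

# Both moves from the text are worth roughly +1.0
gain_a = log_rank_gain(4711, 471)
gain_b = log_rank_gain(471, 47)
```

Because only the rank enters the metric, day-to-day swings in raw view counts do not change the value unless the ordering of pages changes.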
The day-to-day trend analysis (see village pumps) is fascinating to look at, for example how [[Ike Turner]] quickly became a prominent article at the announcement of his death, but I don't know if this analysis has any real practical use.
It would be nice to correlate page popularity to page quality. However, I'm not sure how to determine page quality.
One simple metric could be the number of interlanguage links. If an article is among the top 50 in one language, it probably deserves to be translated into other languages.
Perhaps the page count analysis should be combined with category tree tools already available on the Toolserver? One could present a category as a "tag cloud" with large fonts for the more popular pages.
On 16.12.2007 16:27:47, Lars Aronsson wrote:
The only clear answer for now is that a simple weekly summary of the very most popular (a few hundred or so) pages could serve as a to-do list for focusing improvement work. This is similar to the existing "WikiCharts top 100" on the Toolserver, http://tools.wikimedia.de/~leon/stats/wikicharts/
It's just that these suck and are going to be replaced by something that analyses the Squid log stream.
Leon
wikitech-l@lists.wikimedia.org