[Toolserver-l] Page view stats

Alex mrzmanwiki at gmail.com
Mon Mar 29 23:01:46 UTC 2010


On 3/29/2010 6:08 PM, River Tarnell wrote:
> Lars Aronsson:
>> Is page view statistics (as in stats.grok.se) being imported to the
>> toolserver and available in a database for quick reference?
> 
> A user has made this available in raw form at /mnt/user-store/stats.  We
> are currently working on making this more official; that'll probably be
> announced in a month or two.
> 
> It's not available in the database yet, but that's something we're
> looking at doing.  If anyone else has a particular reason to need this
> data, it would help if they could describe it, so we can decide how to
> format the data, and how detailed it needs to be.
> 
> 	- river.

I currently use the pageview data, for reasons similar to those Lars
mentioned. I produce monthly pageview reports for WikiProjects on
enwiki.[1] Currently I only keep the data for the projects that are
subscribed to the service (305 projects, ~2.3 million pages for April 2010).

If it were in the database, things could be easier, though I'd have to
rewrite a lot of code. I only need monthly data for what I'm currently
doing, but daily data could potentially be useful for other things.

The current (raw) format is fine for me, though it would be nice if the
files used more consistent naming. Most are in the form:
pagecounts-YYYYMMDD-HH0000.gz
but every once in a while,
pagecounts-YYYYMMDD-HH0001.gz
and very rarely, things like
pagecounts-YYYYMMDD-HH2001.gz
and a few other variations. Also, every once in a while, there are long
delays before files become available, or files are missing entirely.
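
For anyone else reading the raw files, here is a rough sketch of one way
to cope with the naming variations and spot missing hours. It is not the
code my tool uses; the function names are made up for illustration, and
it assumes that whatever the last four digits are, the file belongs to
the hour given by YYYYMMDD-HH.

```python
import re
from datetime import datetime, timedelta

# Assumption: the trailing four digits (0000, 0001, 2001, ...) only vary
# because of when the dump was written, so they are ignored here.
PAGECOUNTS_RE = re.compile(r"^pagecounts-(\d{8})-(\d{2})\d{4}\.gz$")


def parse_pagecounts_name(filename):
    """Return the hour a pagecounts file covers, or None if the name does not fit."""
    match = PAGECOUNTS_RE.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H")
    except ValueError:
        # Digits were present but do not form a valid date and hour.
        return None


def missing_hours(filenames, start, end):
    """List every hour in [start, end) for which no file name was seen."""
    seen = {parse_pagecounts_name(name) for name in filenames}
    missing = []
    hour = start
    while hour < end:
        if hour not in seen:
            missing.append(hour)
        hour += timedelta(hours=1)
    return missing
```

Calling missing_hours() with a directory listing and the first and last
hour of a month gives the gaps that would need to be noted in a report.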

If there is interest in having this in the database somewhere, I might
be able to help out in terms of coding, as much of what I have could be
fairly easily adapted to extract the data for all pages on all projects,
rather than just the ones on the English Wikipedia that my tool needs.
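
As a rough idea of what that extraction could look like (a sketch only,
not my existing code), the following sums hourly counts into running
totals. It assumes each line of a raw file has four space-separated
fields (project code, page title, request count, bytes transferred);
add_counts and add_file are hypothetical names.

```python
import gzip
from collections import Counter


def add_counts(lines, totals, project=None):
    """Add the request counts from an iterable of raw lines to totals.

    totals is a Counter keyed by (project, title). If project is given,
    lines for other projects are skipped.
    """
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed layout
        proj, title, count, _size = parts
        if project is not None and proj != project:
            continue
        try:
            totals[(proj, title)] += int(count)
        except ValueError:
            continue  # count field was not a number
    return totals


def add_file(path, totals, project=None):
    """Read one gzipped hourly file and add its counts to totals."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return add_counts(handle, totals, project)
```

Running add_file() over every hourly file for a month, with one shared
Counter, would give monthly totals; leaving project as None covers all
projects instead of just "en".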

[1] http://toolserver.org/~alexz/pop/view.php

-- 
Alex (wikipedia:en:User:Mr.Z-man)


