[Toolserver-l] Archive of visitor stats

Frédéric Schütz schutz at mathgen.ch
Thu Sep 17 20:34:07 UTC 2009


Lars Aronsson wrote:

> Are visitor stats (as produced by Domas) safely archived 
> somewhere, for example on the toolserver, where development 
> projects can easily access them for analysis?  I have made my own 
> copies of the files (I guess my plan was to use them, but this 
> hasn't started yet), but now I'm running out of disk and I 
> urgently need to clear some space on that server.
> 
> I just deleted September 2009 (last 2 weeks) and that freed 9 GB.
> 
> The oldest I have is pagecounts-20071209-180000.gz

As Platonides mentioned, they are in /mnt/user-store/stats on the 
toolserver; however, I would not call that "safely archived": one of my 
cron jobs just copies them from Domas's server, and that's it.

At the moment, there should be everything starting from 1 January 2009 
(part of it disappeared at some point, but I managed to recover it).

However, this is definitely not a sustainable solution in the long 
run: the files currently take 335 GB (out of 1.5 TB of total space).

Erik Zachte stores archives of visitor stats in a better format, 
aggregating some of the older data and storing several days of data in 
one file. I started looking into these files earlier this year, planning 
to spend some time playing with this data. One of my ideas was to 
replicate the statistical data that is on the WMF stats server somewhere 
on the toolserver -- and do it "officially" and not just by copying 
files using a personal cron job. Unfortunately, "real life" took over 
and I did not manage to continue this (and still can't). However, if 
there is any interest in improving the situation, I'd be glad to look 
into it as soon as I can.
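
A minimal sketch of the aggregation idea (several hourly files folded into 
one set of per-day totals), not a description of Erik's actual format. It 
assumes that each hourly file is named like pagecounts-20071209-180000.gz 
and that each line has four space-separated fields: project, page title, 
request count, bytes. The function names are hypothetical.

```python
import gzip
import re
from collections import Counter

# Assumption: hourly files are named like pagecounts-20071209-180000.gz
NAME_RE = re.compile(r"pagecounts-(\d{8})-\d{6}\.gz$")


def day_of(filename):
    """Return the YYYYMMDD part of an hourly file name, or None if it does not match."""
    match = NAME_RE.search(filename)
    return match.group(1) if match else None


def aggregate_lines(lines, totals=None):
    """Add hourly lines to a Counter keyed by (project, title).

    Assumption: each line is 'project title count bytes'.
    Lines that do not fit this shape are skipped.
    """
    totals = Counter() if totals is None else totals
    for line in lines:
        parts = line.split()
        if len(parts) != 4 or not parts[2].isdigit():
            continue
        totals[(parts[0], parts[1])] += int(parts[2])
    return totals


def aggregate_files(paths):
    """Sum request counts over hourly .gz files, grouped by day."""
    per_day = {}
    for path in paths:
        day = day_of(path)
        if day is None:
            continue  # not an hourly pagecounts file
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            aggregate_lines(fh, per_day.setdefault(day, Counter()))
    return per_day
```

Calling aggregate_files() on the 24 hourly files of one day would give a 
single Counter for that day, which could then be written out as one file 
in place of the originals.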

I am cc'ing Erik, who may have more to say.

Cheers,

Frédéric


