On 09/09/2010 10:54 PM, Jamie Morken wrote:
Hi all,
If anyone can help with #2 to provide the access log of image usage stats please send me an email!
2. sort the image list based on usage frequency from access log files
The raw data is one file per hour, containing a list of page names and visit counts. From just one such file, you get statistics on which pages were the most visited during that particular hour. By combining more files, you can get statistics for a whole day, a week, a month, a year, all Mondays, all 7am hours around the year, the 3rd Sunday after Easter, or whatever. The combinations are almost endless.
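To make the "combining files" idea concrete, here is a rough Python sketch. It assumes each line of an hourly file is just a page name and a visit count separated by whitespace; the real files may carry more fields, so treat the parsing as a guess. The function names (parse_hourly, combine) and the sample data are made up for illustration.

```python
from collections import Counter

def parse_hourly(lines):
    """Turn the lines of one hourly file into a Counter of page -> visits.

    Assumed line format (hypothetical): "<page_name> <visit_count>".
    Lines that do not fit that shape are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            counts[parts[0]] += int(parts[1])
        except ValueError:
            continue  # visit count was not a number
    return counts

def combine(hourly_counters):
    """Add up any selection of hourly Counters (a day, all Mondays, ...)."""
    total = Counter()
    for counter in hourly_counters:
        total.update(counter)  # Counter.update adds counts together
    return total

# Made-up sample data standing in for two hourly files.
hour_a = ["Main_Page 120", "Easter 7", "Monday 3"]
hour_b = ["Main_Page 95", "Easter 12"]

total = combine(parse_hourly(h) for h in (hour_a, hour_b))
top = total.most_common(2)
print(top)
```

Which hours you feed into combine() decides what the result means: all 24 files of one date give a daily total, every file from a Monday gives an "all Mondays" figure, and so on.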
How do we boil this down to a few datasets that are most useful? Is that the total visit count per month? Or what?
Are these visitor stats already in a database on the toolserver? If so, how are they organized?
I wrote some documentation on the access log format here: http://www.archive.org/details/wikipedia_visitor_stats_200712