I saw mention on our tentative hardware order [1] of disk space for log files, although not specifically the traffic logs. I believe Kate mentioned we generate around 2GB per day, which means we have roughly 360GB of these (about 180 days' worth at that rate) sitting around waiting to be processed. Would it be possible to purchase or set aside a server or two to take care of this, with the (perhaps pipe-dream) goal of then running webalizer every day?
[1] http://meta.wikimedia.org/wiki/Hardware_ordered_March_2005
/Alterego
I believe I mentioned this on one of these lists recently, but there are two people at the National Bureau of Economic Research (http://en.wikipedia.org/wiki/NBER) here in Cambridge who are excited about helping produce more frequent and more comprehensive Wikipedia statistics. (They want to use them in academic studies; see [[Wikipedia:Wikiproject Wikidemia]].)
Clearly, log files would have to be processed securely within the server cluster, but some of the script writing and testing could perhaps be offloaded to an NBER programmer. For instance, I believe that certain key webalizer stats aren't accurate under the current setup, and the scripts would have to cleverly handle requests from the fr: squids.
What older stats scripts, aside from webalizer, need to be rewritten? What scripts aren't run often because of the amount of time they take? Is there anything non-Foundation developers can do to speed up the process of getting a dedicated machine to process logs?
SJ
On Apr 12, 2005 11:08 AM, Brian reflection@gmail.com wrote:
I saw mention on our tentative hardware order [1] of disk space for log files, although not specifically the traffic logs. I believe Kate mentioned we generate around 2GB per day, which means we have roughly 360GB of these (about 180 days' worth at that rate) sitting around waiting to be processed. Would it be possible to purchase or set aside a server or two to take care of this, with the (perhaps pipe-dream) goal of then running webalizer every day?
[1] http://meta.wikimedia.org/wiki/Hardware_ordered_March_2005
/Alterego