Jimmy (Jimbo) Wales wrote:
Ashar Voultoiz wrote:
Yes. Not enough CPU / disk space to generate statistics. iirc the last attempt resulted in a total block of the whole cluster. Hopefully one server will be dedicated to handling logs in the near future and webalizer stats will then be built again.
I strongly support the idea of a machine for handling logs, statistics, and research into those things. I think we could learn a lot about how the community really functions by studying logs, and the traffic stats are important for us to understand and project our growth patterns.
My question to wikitech-l: what sort of machine should we use for this? I am thinking that a fully loaded dual Opteron is overkill, but a typical apache is too small?
Probably we'd want to have a lot of disk space, possibly RAID 5 as the best balance between storage space, redundancy, and speed. (We don't want to lose a huge chunk of data to a bad hard drive, but on the other hand we don't need absolute speed either.)
Probably we'd want a decent CPU, but it doesn't have to be top-notch, since these are batch jobs and the machine should not be doing anything else anyway.
Your thoughts?
--Jimbo
A storage server like http://www.siliconmechanics.com/i1547/serial-ata-storage-server.php would be nice, I think. We could use it to store logs and backups, process logs, and compute stats.
It doesn't really need to be fast. An example config:
Dual Xeon 2.4 GHz, 1 GB RAM, 12 Seagate 400 GB drives (not sure we need all this space)
RAID 5 with 1 (2?) hot spare; with 1 hot spare that means 10 x 400 GB = 4 TB of space to store logs and backups (backups take up more and more space every day; Commons already eats ~6 GB)
Cost: $9,359
With 12 x 250 GB drives, the cost falls to $6,335; with 1 hot spare, 10 x 250 GB = 2.5 TB
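For anyone who wants to check the capacity figures, here is a minimal Python sketch of the arithmetic. It assumes one RAID 5 array built from all the non-spare drives (so one drive's worth of space goes to parity) and decimal units (1 TB = 1000 GB). The function name raid5_usable_gb is made up for illustration and is not part of any tool.

```python
def raid5_usable_gb(total_drives: int, drive_gb: int, hot_spares: int = 1) -> int:
    """Usable capacity in GB of a single RAID 5 array (assumed layout)."""
    # Hot spares sit idle until a drive fails, so they hold no data.
    active = total_drives - hot_spares
    if active < 3:
        raise ValueError("RAID 5 needs at least 3 active drives")
    # RAID 5 spends the equivalent of one active drive on parity.
    return (active - 1) * drive_gb


if __name__ == "__main__":
    # The two 12-drive configs discussed above, with 1 or 2 hot spares.
    for drive_gb in (400, 250):
        for spares in (1, 2):
            usable = raid5_usable_gb(12, drive_gb, spares)
            print(f"12 x {drive_gb} GB, {spares} hot spare(s): {usable} GB usable")
```

With 1 hot spare this gives the 4 TB and 2.5 TB figures quoted above; going to 2 hot spares costs one more drive's worth of space in each case.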
My 2 cents