I noticed that the Webalizer stats have not been updated since October.
http://wikimedia.org/stats/nl.wikipedia.org//
Is that normal?
The stats at http://en.wikipedia.org/wikistats/NL/Sitemap.htm are still updated, so there is no real pressing need; I am just asking to know.
[[gebruiker:walter]]
Walter Vermeir wrote:
> I noticed that the Webalizer stats have not been updated since October.
> http://wikimedia.org/stats/nl.wikipedia.org//
> Is that normal?
Yes. Not enough CPU / disk space to generate statistics. IIRC the last attempt resulted in a total block of the whole cluster. Hopefully one server will be dedicated to handling logs in the near future, and the Webalizer stats will then be built again.
> The stats at http://en.wikipedia.org/wikistats/NL/Sitemap.htm are still updated, so there is no real pressing need; I am just asking to know.
> [[gebruiker:walter]]
That one uses the SQL dump, and it looks like it can still be run on one of the servers.
cheers,
--
Ashar Voultoiz - WP++++
http://en.wikipedia.org/wiki/User:Hashar
Servers in trouble? noc (at) wikimedia (dot) org
"This signature is a virus. Copy me in yours to spread it."
Ashar Voultoiz wrote:
> Yes. Not enough CPU / disk space to generate statistics. IIRC the last attempt resulted in a total block of the whole cluster. Hopefully one server will be dedicated to handling logs in the near future, and the Webalizer stats will then be built again.
I strongly support the idea of a machine for handling logs, statistics, and research into those things. I think we could learn a lot about how the community really functions by studying logs, and the traffic stats are important for us to understand and project our growth patterns.
My question to wikitech-l: what sort of machine should we use for this? I am thinking that a fully loaded dual Opteron is overkill, but a typical Apache box is too small?
Probably we'd want to have a lot of disk space, possibly RAID 5 as the best balance between storage space, redundancy, and speed. (We don't want to lose a huge chunk of data to a bad hard drive, but on the other hand we don't need absolute speed either.)
Probably we'd want a decent CPU, but it doesn't have to be top-notch, since these are batch jobs and the machine should not be doing anything else anyway.
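To make the RAID arithmetic concrete, here is a rough Python sketch of the capacity tradeoff (usable_gb is just an illustrative helper, and the drive counts below are an example, not a spec):

    # Usable capacity of an array of n identical drives, ignoring
    # filesystem overhead. RAID 5 spends one drive's worth of space
    # on parity and survives a single drive failure; RAID 0 gives
    # full capacity but no redundancy; RAID 1 mirrors, halving space.
    def usable_gb(n_drives, drive_gb, level):
        if level == 0:
            return n_drives * drive_gb
        if level == 1:
            return n_drives * drive_gb // 2
        if level == 5:
            return (n_drives - 1) * drive_gb
        raise ValueError("unsupported RAID level")

    # Eleven active 400 GB drives in RAID 5: (11 - 1) * 400 = 4000 GB,
    # and any one of the eleven can die without losing data.
    print(usable_gb(11, 400, 5))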
Your thoughts?
--Jimbo
On Jan 5, 2005, at 5:44 AM, Jimmy (Jimbo) Wales wrote:
> I strongly support the idea of a machine for handling logs, statistics, and research into those things. I think we could learn a lot about how the community really functions by studying logs, and the traffic stats are important for us to understand and project our growth patterns.
> My question to wikitech-l: what sort of machine should we use for this? I am thinking that a fully loaded dual Opteron is overkill, but a typical Apache box is too small?
It doesn't have to run super fast, it just has to run independently and therefore not kill our other, vital machines.
The problem we had before was that these things were run on Zwinger, and it ate up all the memory, scarfed up all the disk bandwidth, and went into swap for hours on end, sending everything into a downward spiral of doom.
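For what it's worth, a dedicated box would also let the stats jobs stream the logs a line at a time instead of slurping them into memory. A minimal sketch of that style (the filename and field layout here are assumptions, not our actual log format):

    import gzip
    from collections import defaultdict

    hits = defaultdict(int)

    # Read the compressed access log one line at a time, so memory
    # grows with the number of distinct URLs seen, not with the size
    # of the log itself. The URL is assumed to be the 7th
    # whitespace-separated field, as in common log format.
    with gzip.open("access.log.gz", "rt", errors="replace") as log:
        for line in log:
            fields = line.split()
            if len(fields) > 6:
                hits[fields[6]] += 1

    # Print the ten most-requested URLs.
    top = sorted(hits.items(), key=lambda kv: kv[1], reverse=True)[:10]
    for url, count in top:
        print(count, url)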
-- brion vibber (brion @ pobox.com)
Jimmy (Jimbo) Wales wrote:
> Ashar Voultoiz wrote:
>> Yes. Not enough CPU / disk space to generate statistics. IIRC the last attempt resulted in a total block of the whole cluster. Hopefully one server will be dedicated to handling logs in the near future, and the Webalizer stats will then be built again.
> I strongly support the idea of a machine for handling logs, statistics, and research into those things. I think we could learn a lot about how the community really functions by studying logs, and the traffic stats are important for us to understand and project our growth patterns.
> My question to wikitech-l: what sort of machine should we use for this? I am thinking that a fully loaded dual Opteron is overkill, but a typical Apache box is too small?
> Probably we'd want to have a lot of disk space, possibly RAID 5 as the best balance between storage space, redundancy, and speed. (We don't want to lose a huge chunk of data to a bad hard drive, but on the other hand we don't need absolute speed either.)
> Probably we'd want a decent CPU, but it doesn't have to be top-notch, since these are batch jobs and the machine should not be doing anything else anyway.
> Your thoughts?
> --Jimbo
A storage server like http://www.siliconmechanics.com/i1547/serial-ata-storage-server.php would be nice, I think. We could use it to store logs and backups, process logs, and compute stats.
It does not really need to be fast. A config example:
Dual Xeon 2.4 GHz, 1 GB RAM, 12 Seagate 400 GB drives (not sure we need all this space)
RAID 5 with 1 (2?) hot spare. With 1 hot spare that means 10x400 GB = 4 TB of space for logs and backups (backups take more space every day; Commons alone already eats ~6 GB).
Cost: $9,359
With 12x250 GB drives the cost falls to $6,335; with 1 hot spare that gives 10x250 GB = 2.5 TB.
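Putting the two quotes side by side, a quick Python check of cost per usable gigabyte (using the prices above and the 1-hot-spare RAID 5 layout, i.e. 10 drives of usable space):

    # 12 drives, 1 hot spare, RAID 5 over the remaining 11 drives:
    # 10 drives' worth of usable space after parity.
    for drive_gb, price_usd in ((400, 9359), (250, 6335)):
        usable = 10 * drive_gb
        print("12x%d GB: %d usable GB at $%.2f/GB"
              % (drive_gb, usable, price_usd / usable))

    # Output:
    # 12x400 GB: 4000 usable GB at $2.34/GB
    # 12x250 GB: 2500 usable GB at $2.53/GB

So the 400 GB config is actually slightly cheaper per usable gigabyte, if we would grow into the space.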
My 2 cents