On Sat, 03 Jan 2004 11:41:59 +0000, Nick Hill wrote:
> Do we have the means to replay typical Wikipedia activity from a log
> file? I am thinking of a Perl script which reads a real Wikipedia
> common log file and replays the same load pattern at definable speeds.
> If someone has a pre-configured server image and some real log files, I
> can do this. I have already written a simple Perl script which can be
> modified to replay server load in real time from log files.
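Something along these lines could be a starting point -- a rough,
untested sketch on my part; the target host argument, the speed factor
and the GET-only replay are my own assumptions (edits/POSTs would need
extra handling):

  #!/usr/bin/perl
  # Replay GET requests from a common log file on stdin against a
  # target host, preserving the original pacing (scaled by a factor).
  use strict;
  use warnings;
  use Time::HiRes qw(sleep time);
  use Time::Local qw(timegm);
  use LWP::UserAgent;

  my ($target, $speed) = @ARGV;     # e.g. ./replay.pl test.example.org 2
  $speed ||= 1;                     # 1 = real time, 2 = twice as fast
  my $ua  = LWP::UserAgent->new(timeout => 10);
  my %mon = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);
  my ($log_start, $replay_start);

  while (<STDIN>) {
      # e.g. ... [03/Jan/2004:11:41:59 +0000] "GET /wiki/Foo HTTP/1.0" ...
      # (the timezone offset is ignored; only relative times matter here)
      next unless m{\[(\d+)/(\w+)/(\d+):(\d+):(\d+):(\d+) [^\]]*\] "GET ([^ "]+)};
      my ($path, $t) = ($7, timegm($6, $5, $4, $1, $mon{$2}, $3));
      $log_start    = $t    unless defined $log_start;
      $replay_start = time  unless defined $replay_start;
      # Sleep until this request is due at the chosen replay speed.
      my $wait = $replay_start + ($t - $log_start) / $speed - time;
      sleep($wait) if $wait > 0;
      $ua->get("http://$target$path");
  }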
http://www.cs.virginia.edu/~rz5b/software/logreplayer-manual.htm could
be useful, or http://opensta.org/.
Even more useful would be a full incoming traffic dump and a full disk
image of the start condition (it's mainly the editing that is
interesting). tcpdump and tcpreplay might be good candidates for this.
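Untested, but the capture and replay could look roughly like this
(assuming the traffic arrives as plain HTTP on eth0; tcpreplay's
options differ between versions):

  tcpdump -i eth0 -s 0 -w wiki-traffic.pcap port 80
  tcpreplay -i eth0 wiki-traffic.pcap

One caveat: replaying raw TCP against a different server isn't
straightforward (the new machine's sequence numbers and handshakes
won't match the captured ones), so the dump might be more useful as
input for an HTTP-level replayer like the script above.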
If it turns out that the minimum server for the current database really
has to be a 64-bit, 8GB 'cheap' machine, then it might get pretty hard
when the database grows (especially with multiple languages etc.). Just
an import of http://www.meyers-konversationslexikon.de/ into the German
Wikipedia (not to mention its nice illustrations) would create
problems, I guess.
> A compressed SCP connection using the blowfish cipher for English text
> between two AMD Athlon XP2200+ CPUs gives a throughput of 3.3MB/sec.
> The majority of the load is on the sending machine. A dual Opteron 240
> with a 64-bit-optimised cipher algorithm can probably transfer
> 12MB/sec or more.
The sending machine would have to create a compressed & encrypted
stream for each mirror. That's not its main job, but it would take some
CPU time away from the database work etc.
How high was the CPU load of just the scp process on the Athlon?
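For a rough per-stream figure, one could copy a text dump with
compression and the blowfish cipher while watching the scp process in
top (the file name and host here are made up):

  scp -c blowfish -C pages.sql mirror1.example.org:/data/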
Gabriel Wicke