On Sat, Jan 28, 2012 at 10:56:27AM +0200, Ariel T. Glenn wrote:
> Sure, but that assumes you are not already using those other cores for something else. In our case, we are ;-)
So do we (the Perl scripts use e.g. http://search.cpan.org/~dlux/Parallel-ForkManager/, so we run several bunzip2 processes at once). Despite SSDs, a RAID array etc., the machines still have an abundance of CPU power, so naturally I would like to throw CPU time at some of these problems instead of watching 2-4% CPU utilization and disk waits. :-)
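For reference, a minimal sketch of that pattern (the paths, the process count and the bunzip2 flags are placeholders, not our actual script):

    use strict;
    use warnings;
    use Parallel::ForkManager;

    my @dumps    = glob('/data/dumps/*wiki*.bz2');   # placeholder file list
    my $max_jobs = 8;                                # placeholder process count

    my $pm = Parallel::ForkManager->new($max_jobs);
    for my $dump (@dumps) {
        $pm->start and next;              # fork; parent moves on to the next file
        system('bunzip2', '-k', $dump);   # each child runs its own bunzip2
        $pm->finish;
    }
    $pm->wait_all_children;

Each bunzip2 is still single-threaded, of course; the parallelism only comes from running one process per dump file.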
pbunzip2 allows me to do exactly that, but only under certain circumstances (uncompressing a pbzip2-packed archive): I can read the whole archive into memory, decompress it in parallel, and then write the result out to the disks in bulk.
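Roughly like this, assuming the dump really was written by pbzip2 and 8 CPUs are free (the paths, the tmpfs staging and the -p count are assumptions for the sketch); a stock single-stream bzip2 file will still decompress on one core even through pbzip2:

    use strict;
    use warnings;

    my $archive = '/dev/shm/enwiki-pages.xml.bz2';   # archive copied to RAM (tmpfs) first
    my $outfile = '/data/work/enwiki-pages.xml';

    # -d decompress, -c to stdout, -p8 use 8 processors
    system("pbzip2 -d -c -p8 '$archive' > '$outfile'") == 0
        or die "pbzip2 failed: $?";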
I have no problem continuing with the status quo (several processes on SMP), but I still see a CPU load of just 50% on average. Despite hyperthreading, of the 16 cores (8 physical + 8 virtual), only ~4 physical CPUs are under load. I guess the machine will get some VMs to host then. :-)
The situation changes if we repack the archives with pbzip2: during unpacking we then see a load of 6 to 6.5 physical CPUs per machine and a 4x speedup. But repacking is pointless in everyday use, as we inspect (unpack) every wiki archive just once; we did the repacking only for evaluation purposes.
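For completeness, the repacking in that evaluation was essentially just the following (file names and the -p count are again placeholders):

    use strict;
    use warnings;

    my $old = '/data/dumps/enwiki-pages.xml.bz2';    # stock bzip2 archive from the dump
    my $new = '/data/dumps/enwiki-pages.xml.pbz2';   # multi-stream pbzip2 repack

    # one-time cost: afterwards every decompression of $new can run in parallel
    system("bzcat '$old' | pbzip2 -c -p8 > '$new'") == 0
        or die "repack failed: $?";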
OK, keep the status quo, but my plea is to think in the future about the growing number of CPU cores in the machines of the people using your dumps. ;-)
regards,