[Foundation-l] dumps

Delirium delirium at hackish.org
Thu Feb 26 01:42:59 UTC 2009


Brian wrote:
> Ahh ok. Anyone who wants to do processing on the full history (and there are
> a lot of these people who exist!) by definition *has* to be willing to throw
> some money at it. It simply doesn't fit on commercial drives.

I've personally never found much of a compelling reason to actually 
uncompress the dump, rather than working on the stream as it's being 
decompressed. 7zip decompression is pretty fast, and can use multiple 
cores on multi-core machines, so it never seems to be a bottleneck, for 
me at least: I get somewhere around 30-40 MB/s typically. From what I 
can tell, the top-end EC2 instances do perform rather better than that, 
topping out at around 200 MB/s for sequential reads. But I don't 
personally run anything that can't run 5x slower in return for being 
free, and I suspect lots of analysis is of that "just let it run for a 
week, who cares" variety.
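
To make the "work on the stream" idea concrete, here is a minimal sketch, not a description of any particular tool I run. It assumes a `7z` binary on the PATH (its `x -so` mode writes the extracted data to stdout) and a MediaWiki XML dump made of `<page>` and `<revision>` elements. The function names and the page/revision counting task are purely illustrative.

```python
import subprocess
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}page'."""
    return tag.rsplit("}", 1)[-1]


def count_pages_and_revisions(stream):
    """Count <page> and <revision> elements in one pass over a byte stream."""
    pages = revisions = 0
    context = ET.iterparse(stream, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end":
            continue
        name = local_name(elem.tag)
        if name == "revision":
            revisions += 1
            elem.clear()  # drop the revision text once it has been counted
        elif name == "page":
            pages += 1
            root.clear()  # release finished pages so memory use stays flat
    return pages, revisions


def count_in_7z(path):
    """Stream a .7z dump through the parser without writing it to disk.

    Assumption: a `7z` executable is installed and on the PATH.
    """
    proc = subprocess.Popen(["7z", "x", "-so", path], stdout=subprocess.PIPE)
    try:
        return count_pages_and_revisions(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()
```

Calling `count_in_7z` on a dump file would then read the decompressed XML straight from the pipe, so the uncompressed history never has to exist on disk; the parsing function itself accepts any binary file-like object.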

I'm not going to argue that nobody could benefit from using EC2 to do 
their analysis instead, but it's hardly the case that it's impossible to 
do full-history analysis on commodity hardware.

-Mark



