[Foundation-l] dumps
Delirium
delirium at hackish.org
Thu Feb 26 01:42:59 UTC 2009
Brian wrote:
> Ahh ok. Anyone who wants to do processing on the full history (and there are
> a lot of these people who exist!) by definition *has* to be willing to throw
> some money at it. It simply doesn't fit on commercial drives.
I've personally never found much of a compelling reason to actually
uncompress the dump, rather than working on the stream as it's being
decompressed. 7zip decompression is pretty fast, and can use multiple
cores on multi-core machines, so it never seems to be a bottleneck, for
me at least: I typically get somewhere around 30-40 MB/s. From what I
can tell, the top-end EC2 instances do perform rather better than that,
topping out at around 200 MB/s for sequential reads. But I don't
personally run anything that can't run 5x slower in return for being
free, and I suspect lots of analysis is of that "just let it run for a
week, who cares" variety.
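
For anyone who wants to try the streaming approach, here is a minimal sketch of the idea, not a tuned implementation. It assumes the `7z` command-line tool is on the PATH and that the dump uses the usual MediaWiki export layout (`<page>` elements containing a `<title>` and one or more `<revision>` elements). The function names and the per-page revision count are purely illustrative; substitute whatever analysis you actually need.

```python
import subprocess
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def count_revisions(stream):
    """Count revisions per page title from a file-like XML stream.

    Elements are discarded as soon as they are finished, so memory use
    stays roughly flat apart from the result dictionary itself.
    """
    counts = {}
    title = None
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end":
            continue
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
            counts[title] = 0
        elif name == "revision":
            counts[title] = counts.get(title, 0) + 1
            elem.clear()  # drop the revision text once it has been counted
        elif name == "page":
            root.clear()  # drop the finished page from the tree
    return counts


def count_revisions_in_dump(path):
    """Decompress `path` with `7z e -so` and parse the XML from the pipe.

    Nothing is written to disk: 7z writes the decompressed data to its
    stdout, and the parser reads it from there as it arrives.
    """
    cmd = ["7z", "e", "-so", path]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        return count_revisions(proc.stdout)
```

Calling `count_revisions_in_dump()` with the path to a `.7z` dump would then run the whole pass without the uncompressed XML ever touching the disk. Keeping the parsing in `count_revisions()` separate from the decompression means the same code can be pointed at any file-like object, for instance a pipe from a different decompressor.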
I'm not going to argue that nobody could benefit from using EC2 to do
their analysis instead, but it's hardly the case that it's impossible to
do full-history analysis on commodity hardware.
-Mark