[Foundation-l] dumps
Delirium
delirium at hackish.org
Wed Feb 25 04:43:23 UTC 2009
Brian wrote:
> Why not make the uncompressed dump available as an Amazon Public
> Dataset? http://aws.amazon.com/publicdatasets/
>
> You can already find DBPedia and FreeBase there. It's true that the
> uncompressed dump won't fit on a commercial drive (the largest is a
> 4-platter 500GB = 2TB drive). Cloud computing seems to be the most
> economically feasible alternative for all parties involved.
It depends on the parties. For me as a user, it's more economically
feasible to download the dataset locally and run scripts on my own
machine than to pay for EC2 compute time to run those scripts. But I
have free unlimited university bandwidth.
It does seem like there might be some mutual benefits to having a copy
at Amazon, for those who do prefer it. Since it would become easy to
analyze a full database dump from an Amazon EC2 compute instance, due to
it being already available on the filesystem, a number of people might
use EC2 to run their analysis scripts. From that perspective, perhaps
Amazon could be persuaded to help out? Maybe they could donate some
money, equipment, or developer time to reengineer the dump process, in
return for one part of the reengineering being the addition of a routine
sync to their service?
-Mark