[Foundation-l] Dump process needs serious fix / was Release of squid log data

Anthony wikimail at inbox.org
Sun Sep 16 00:38:27 UTC 2007


On 9/15/07, Samuel Klein <sj at laptop.org> wrote:
> Can someone elaborate on what is going on here?  What are the steps
> involved, and why does this take so long?  It would take less time to copy
> a terabyte of data to a spare disk, drive it to a world-class computing
> cluster anywhere in the country, and have the dumps worked on there
> (including people figuring out another implementation of the dump
> process). Maybe said computing cluster could also become the de facto
> mirror-and-statistics center for Wikipedia data, where researchers would
> send complex queries to be run.
>
If the WMF offered a cheap live feed (at, say, 2x cost), I'm sure
the private market would happily solve the problem.
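
For scale on Samuel's point about driving a disk somewhere: a rough
back-of-the-envelope sketch in Python (the ~60 MB/s disk write rate,
~10 Mbit/s sustained link, and 5-hour drive below are assumptions for
illustration, not figures from this thread):

# Compare moving 1 TB by courier ("sneakernet") vs. over the network.
# All rates are assumed for illustration only.

TERABYTE = 1e12  # bytes

disk_write_rate = 60e6   # ~60 MB/s sequential write to a spare disk (assumed)
network_rate = 10e6 / 8  # ~10 Mbit/s sustained link, in bytes/s (assumed)
drive_hours = 5          # time to physically drive the disk (assumed)

copy_hours = TERABYTE / disk_write_rate / 3600
network_hours = TERABYTE / network_rate / 3600

print(f"copy to disk: {copy_hours:.1f} h, plus ~{drive_hours} h driving")
print(f"network transfer: {network_hours:.1f} h ({network_hours / 24:.1f} days)")

With those assumed numbers the disk copy plus the drive finishes in under
half a day, while the network transfer takes over a week.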


