On 29/01/07, Michael Noda <michael.noda@gmail.com> wrote:
> I'm unfamiliar with the technical aspects of creating a database dump, but is it the sort of thing that would be made better and faster by throwing more computing resources at it?
> I expect the answer to this question is "yes, but the Foundation is $500,000 short, and those hypothetical servers went on the budget chopping block on January 16th." :-( <rhetorical> When's the next fundraiser? </rhetorical>
You'd have to speak to Tim or Brion, but IIRC the problem was simply that the method of generating dumps, which had worked fine in the past, collapsed under the sheer *scale* of enwiki - note that de, fr, etc., were all still being done fine. It seems to have been fixed now; a dump was released late last year, but it was the first one in quite a while.
> > There are certainly some decent offline projects using dumps, though, in one form or another.
> Like, say, Google Earth, which I expect would be ecstatic if it could get more frequent dumps.
GE has dealt directly with the Foundation at some point, and from the reports I've seen it seems to update faster than the usual dump schedule - there was a brief flurry of "misplaced" articles reported just after they released it. They may be doing something clever with periodic crawling of selected pages - as long as it's not "live mirroring" and they run a local cache, we're okay with that.
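(For what it's worth, by "periodic crawling of selected pages" with a local cache I mean something roughly like the toy sketch below. The page list, interval, and cache layout are invented for illustration - I have no idea what GE actually runs - but it shows the distinction: readers get served from the cache, and only the crawler touches the wiki, on a fixed schedule, rather than proxying every reader request live.)

    # Illustrative sketch only: periodically crawl a fixed page list into a
    # local cache. Page list, interval, and cache layout are made-up examples.
    import time
    import urllib.request
    from pathlib import Path

    PAGES = ["Eiffel_Tower", "Statue_of_Liberty"]   # hypothetical selection
    CACHE_DIR = Path("wiki_cache")
    INTERVAL = 24 * 60 * 60                          # re-crawl once a day

    def fetch(title: str) -> bytes:
        url = "https://en.wikipedia.org/wiki/" + title
        req = urllib.request.Request(url, headers={"User-Agent": "example-crawler/0.1"})
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    def crawl_once() -> None:
        CACHE_DIR.mkdir(exist_ok=True)
        for title in PAGES:
            # Readers are served from these cached copies; only the crawler
            # ever talks to the wiki, and only on the fixed schedule.
            (CACHE_DIR / (title + ".html")).write_bytes(fetch(title))

    if __name__ == "__main__":
        while True:
            crawl_once()
            time.sleep(INTERVAL)   # periodic crawling, not "live mirroring"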