[WikiEN-l] Wikipedia in a box

Anthony wikimail at inbox.org
Tue Aug 14 10:30:50 UTC 2007


On 8/14/07, David Gerard <dgerard at gmail.com> wrote:
> http://it.slashdot.org/article.pl?sid=07/08/13/1939231
>
bzip2recover.  Genius.  I've been wanting to do something like this
for a long time, and the one thing standing in my way was that I
couldn't figure out how to do the random access bit.

For those too lazy to read, here's the stroke of genius (IMHO): "We
would certainly prefer not to use MySQL or any other database, since
we are only *reading* Wikipedia, not writing into it. [....] we can
use the bzip2recover tool (part of bzip2 distribution) to "recover"
the individual parts of this compressed file: Basically, BZIP splits
its input into 900K (by default) size blocks, [....] What this means,
in plain English, is that we can convert the huge downloaded .bz2 file
to a large set of small (smaller than 1MB) files, each one
individually decompressible!"
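
To make the random access idea concrete, here is a rough Python sketch.
It is not what the article does: rather than running bzip2recover on an
existing dump, it compresses fixed-size chunks as independent bzip2
streams, which gives the same "each piece decompresses on its own"
property. The names compress_in_chunks and read_range and the chunk
size are my own assumptions for illustration. Note that with real
bzip2recover output the pieces do not correspond to fixed uncompressed
offsets, so you would need a separate index from article title to piece.

```python
import bz2

# Assumed chunk size, chosen to resemble bzip2's default ~900 kB blocks.
CHUNK_SIZE = 900 * 1000


def compress_in_chunks(data, chunk_size=CHUNK_SIZE):
    """Compress each fixed-size slice of data as its own bzip2 stream."""
    return [
        bz2.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_range(chunks, offset, length, chunk_size=CHUNK_SIZE):
    """Return data[offset:offset+length], decompressing only the chunks needed."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    plain = b"".join(bz2.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return plain[start:start + length]


if __name__ == "__main__":
    data = bytes(range(256)) * 10000  # about 2.5 MB of sample input
    chunks = compress_in_chunks(data)
    # Reads 20 bytes straddling the boundary between chunk 0 and chunk 1.
    print(read_range(chunks, 899_990, 20) == data[899_990:900_010])
```

Only the chunks that overlap the requested range are decompressed, which
is the whole point: no database, and no need to unpack the full dump.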

Now, what's the equivalent command to do this with the .7z file?


