On 8/8/07, Lars Aronsson <lars(a)aronsson.se> wrote:
> One way around this could perhaps be to import the full XML dump
> into a local MySQL, then shut down MySQL and put the frozen files
> from /var/lib/mysql/ onto an Ubuntu "live" CDROM or DVD. Just boot
> Ubuntu Linux from this disk, and you can immediately search through
> the full Wikipedia database (as long as that fits on the disk...).
> I have not tried it, and don't know whether it's possible to run
> MySQL from a live disk with pre-loaded tables. Someone with extra
> time on their hands can find a new hobby here. If one person does
> this, everybody else can download and burn the CDROM image and get
> started much faster than waiting for a 30-hour database import.
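For concreteness, the workflow being proposed might look something like the following. The importer (mwdumper), the image-mastering tool (mkisofs), and all file names and paths are my assumptions, not anything stated in the mail; this is a sketch of the idea, not a tested recipe, so the commands are only listed via a heredoc rather than executed:

```shell
# Sketch of the proposed live-disk workflow. Tool names and paths are
# assumptions; nothing below is run, the plan is just printed for review.
PLAN=$(cat <<'EOF'
bzcat enwiki-pages-articles.xml.bz2 \
  | java -jar mwdumper.jar --format=sql:1.5 \
  | mysql -u root wikidb                    # the ~30-hour import step
sudo /etc/init.d/mysql stop                 # freeze the table files on disk
sudo cp -a /var/lib/mysql iso-root/         # copy frozen tables into the image tree
mkisofs -J -R -o wiki-live.iso iso-root/    # master the live CD/DVD image
EOF
)
echo "$PLAN"
```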
An interesting idea, but not very useful for people who actually intend to
reuse the data, since they would then have to dump *and* import it.
It would possibly be handy for people who just want to do a
statistical analysis or something, but not if you're trying to mirror.
And no, the full Wikipedia database is way, way too large to fit on a
DVD. Even the pages-articles dump (articles/templates/image
descriptions/primary meta-pages) looks too large: it's 2.7 GB bz2'd,
and the tables would expand to several times that once decompressed
and imported into MySQL. And that's leaving out images.
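A quick back-of-envelope check; the roughly 5x expansion factor for bzip2-compressed XML text is my assumption, not a figure from this thread:

```shell
# Does the decompressed pages-articles dump fit on a single-layer DVD?
DVD_GB=4.7           # single-layer DVD capacity
DUMP_BZ2_GB=2.7      # compressed dump size, from the mail
EXPANSION=5          # assumed bzip2 expansion factor for wiki text

UNCOMP_GB=$(awk "BEGIN { print $DUMP_BZ2_GB * $EXPANSION }")
FITS=$(awk "BEGIN { print ($UNCOMP_GB <= $DVD_GB) ? \"yes\" : \"no\" }")
echo "uncompressed ~${UNCOMP_GB} GB, fits on DVD: ${FITS}"
# prints: uncompressed ~13.5 GB, fits on DVD: no
```

And that is before MySQL's own storage overhead (indexes etc.), which only makes the gap worse.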