Simetrical wrote:
> And of course the binary files are worthless to you if you're
> importing into MyISAM, PostgreSQL, Oracle (experimental though
> support may be), a different version of InnoDB, or any other
> database you might want to write support for, or for that matter
> a significantly different version of MediaWiki. That's why XML
> files are preferred.
Both the current SQL and XML dumps are similar to unpacking a tar
(or zip) archive into the filesystem your own computer happens to
use. What I wonder is whether there is an equivalent to downloading
an ISO 9660 CD-ROM image and mounting that filesystem (perhaps
read-only) as it is. Modern Unix dialects can "mount" disk images
from files without burning an actual disc. Mounting an existing
filesystem is instantaneous, and as you "cd" and "ls" down into its
file tree, more and more of its inodes are buffered in the RAM
managed by the running kernel.

What I would find useful is "mounting" (rather than "importing") a
frozen copy of a database, then running "use" on this new database,
"show tables", "describe page", and "select count(*) from page".
Apparently MySQL doesn't support this. Does PostgreSQL or Oracle or
any other RDBMS?
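As an aside (not raised in the thread), SQLite is one system whose
databases already have roughly these semantics: the database is a
single file that you open in place, read-only if you like, with no
import step. A minimal sketch, using a made-up "page" table as a
stand-in for a downloaded frozen copy:

```python
import sqlite3

# Build a tiny stand-in database file. In the scenario above, this
# file would simply be downloaded, like a CD-ROM image.
conn = sqlite3.connect("frozen.db")
conn.execute("CREATE TABLE page (page_id INTEGER, page_title TEXT)")
conn.executemany("INSERT INTO page VALUES (?, ?)",
                 [(1, "Main_Page"), (2, "Sandbox")])
conn.commit()
conn.close()

# "Mount" the frozen copy read-only: opening it is instantaneous,
# and file pages are faulted into the OS cache only as queries
# touch them -- much like cd/ls pulling inodes into RAM.
ro = sqlite3.connect("file:frozen.db?mode=ro", uri=True)
print(ro.execute("SELECT count(*) FROM page").fetchone()[0])  # -> 2
ro.close()
```

Of course SQLite is a far cry from a server RDBMS holding a full
Wikipedia dump, but it shows the open-in-place model is feasible.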
One way around this could perhaps be to import the full XML dump
into a local MySQL, then shut down MySQL and put the frozen files
from /var/lib/mysql/ onto an Ubuntu "live" CD-ROM or DVD. Just boot
Ubuntu Linux from this disc, and you can immediately search through
the full Wikipedia database. (As long as that fits on the disc...)
I have not tried it, and don't know if it's possible to run MySQL
from a live disc with pre-loaded tables. Someone with extra time on
their hands can find a new hobby here. If one person does this,
everybody else can download and burn the CD-ROM image and get
started much faster than waiting for a 30-hour database import.
> Not that this is much of an API question anymore.
Correct. Sorry for going off topic.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik -
http://aronsson.se