Simetrical wrote:
And of course the binary files are worthless to you if you're importing into MyISAM, PostgreSQL, Oracle (experimental though support may be), a different version of InnoDB, or any other database you might want to write support for, or for that matter a significantly different version of MediaWiki. That's why XML files are preferred,
Both the current SQL and XML dumps are similar to unpacking a tar (or zip) archive into whatever filesystem your own computer happens to use. What I wonder is whether there is an equivalent to downloading an ISO 9660 CD-ROM image and mounting that filesystem (perhaps read-only) as it is. Modern Unix dialects can "mount" disk images from files without burning an actual disc. Mounting an existing filesystem is instantaneous, and as you "cd" and "ls" down into its file tree, more and more of its inodes are buffered in the RAM managed by the running kernel. What I would find useful is "mounting" (rather than "importing") a frozen copy of a database, then being able to "use" the new database, "show tables", "describe page", and "select count(*) from page". Apparently MySQL doesn't support this. Does PostgreSQL, Oracle, or any other RDBMS?
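For comparison, here is what the filesystem side of the analogy looks like on Linux. This is only a sketch: the image name and mount point are illustrative, and loopback mounts need root.

```shell
# Mount an ISO 9660 image read-only through a loopback device (needs root).
# "wikipedia.iso" and "/mnt/wiki" are made-up names for illustration.
mkdir -p /mnt/wiki
mount -o loop,ro wikipedia.iso /mnt/wiki

# The tree is browsable immediately; nothing had to be "imported" first.
ls /mnt/wiki

# Detach when done.
umount /mnt/wiki
```

Nothing in a stock MySQL corresponds to this: its storage engines expect a writable data directory they initialized themselves, not a foreign read-only image.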
One way around this could be to import the full XML dump into a local MySQL, then shut down MySQL and put the frozen files from /var/lib/mysql/ onto an Ubuntu "live" CD-ROM or DVD. Just boot Ubuntu Linux from that disc, and you can immediately search the full Wikipedia database (as long as it fits on the disc...). I have not tried it, and I don't know if it's possible to run MySQL from a live disc with pre-loaded tables. Someone with extra time on their hands can find a new hobby here. If one person does this, everybody else can download and burn the CD-ROM image and get started much faster than waiting for a 30-hour database import.
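The recipe would look roughly like the following. I haven't tried it either; the database name, dump file, and importer invocation are stand-ins for whatever you actually use (mwdumper is MediaWiki's usual XML-to-SQL importer).

```shell
# 1. Import the XML dump into a local MySQL -- the slow, ~30-hour step.
#    "wikidb" and the dump filename are illustrative.
java -jar mwdumper.jar --format=sql:1.5 enwiki-pages-articles.xml.bz2 \
  | mysql -u root wikidb

# 2. Stop the server so the table files on disk are flushed and frozen.
sudo /etc/init.d/mysql stop

# 3. Archive the data directory; this tarball is what would go onto the
#    remastered live CD/DVD, whose boot scripts point mysqld at the copy.
sudo tar -czf wikidb-datadir.tar.gz -C /var/lib/mysql .
```

The open question is step 3's other half: a live disc is read-only media, so the boot scripts would have to unpack the datadir into a ramdisk or a unionfs overlay before starting mysqld, which eats into the RAM the kernel would otherwise use for caching.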
Not that this is much of an API question anymore.
Correct. Sorry for going off topic.