If you can narrow down the request a bit that makes it more likely we'll slip something into the backup script. :) [...snip...] Is that actually what people want?
Well, given the disk space limitation, we know straight off the bat that people can't have everything they want. In particular, I suspect the all-revisions version in SQL would be way too large (extra = 40 gig).
That leaves only SQL versions of pages_current.xml.gz and pages_public.xml.gz. If there is space for both, that would be ideal, but I understand that may be asking too much.
Personally, although I don't want the talk pages, others might, so if there's only space for one of these two, then I think a SQL version of pages_current.xml.gz is the way to go (i.e. current revisions of all pages), because it would be applicable for the widest possible audience.
Why do that when MediaWiki comes with an import tool built-in? ;)
Because I'm not importing it into MediaWiki. I'm importing it into a database, to then run non-MediaWiki software analysing the data - in particular: looking for bad wiki syntax ( http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Syntax ), suggesting new useful redirects and disambigs ( http://en.wikipedia.org/wiki/User:Nickj/Redirects ), and searching for good potential wiki-links ( http://en.wikipedia.org/wiki/User:LinkBot - [although for this one I haven't worked out how to get those suggestions out to page authors in a really satisfactory way]). I want to improve the Wikipedia, not mirror it.
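To make the "non-MediaWiki software analysing the data" part a bit more concrete, here is a rough sketch of the kind of check that becomes easy once the current revisions sit in a plain database table. Everything specific in it is an assumption for illustration only: the table name `pages`, its `title` and `text` columns, and the use of SQLite as a stand-in for MySQL so the example runs on its own. The bracket-count test is a deliberately crude example of a syntax check, not the actual method used by the Wiki Syntax project.

```python
import sqlite3


def unbalanced_link_pages(conn):
    """Return titles of pages whose text has unequal counts of '[[' and ']]'.

    Assumes a hypothetical table pages(title, text); a real dump's schema
    will differ and the query would need adjusting.
    """
    bad = []
    for title, text in conn.execute("SELECT title, text FROM pages ORDER BY title"):
        # Crude heuristic: every opening '[[' should have a closing ']]'.
        if text.count("[[") != text.count("]]"):
            bad.append(title)
    return bad


def demo():
    # In-memory SQLite database standing in for a MySQL import of the dump.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (title TEXT PRIMARY KEY, text TEXT)")
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?)",
        [
            ("Good page", "See [[Sydney]] and [[Perth]]."),
            ("Bad page", "See [[Sydney and [[Perth]]."),
        ],
    )
    return unbalanced_link_pages(conn)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that this sort of analysis needs rows of page text in a database, not a working MediaWiki install.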
In which version?
Ideally something that works with MySQL 3.23.49, but if that's too old then something that works with MySQL 4.0.24 instead.
What about everyone who wants something slightly different?
But the database dumps have never tried to be all things to all people.
Rather, they've been snapshots of the various Wikipedias at semi-regular intervals of time, which you can load into a database (specifically MySQL, but if you can get it to work in another RDBMS, then more power to you).
I'm not asking for something entirely new; rather, I'm asking for an equivalent replacement for what we already had.
(Also the SQL dump output needs to actually be tested before we dedicate a few gigs to it.)
I'm happy to be a guinea pig. Just tell me where to get it from, and I'll leave it importing overnight and report back any errors/problems and whether it worked or not.
All the best, Nick.