This would triple the disk space requirements for the data dumps (quadruple after the next major upgrade, quintuple the time after that...)
Surely it should only double the disk space requirements? XML format dumps, I would say, are the same size as, or possibly even slightly larger than, SQL dumps. After all, the main content is the article text, and that's the same in both (apart from some extra slashes in SQL), but you lose the overhead of the XML gumpf. I'd be surprised if it wasn't a wash, or close enough to one as makes no real-world difference.
Besides, I think most people probably don't want every revision ever. Nor do they probably want talk pages. In other words, that's one extra file, namely the SQL version of pages_public.xml.gz, whose size is going to be almost the same. For EN, the largest of all, that's only ~900 megs. For 900 megs it stops people whining.
you can transform to whatever local format you need. (And we provide software for you to do that if you like.)
What most people need is to get it into a database for further work, and the fact that there's software for this at all shows there's demand for it.
And what's the point of every user who wants an SQL dump downloading the XML version, downloading mwdumper, downloading mono, setting up mono, running mwdumper, and creating the dump? Wouldn't it make more sense to run the conversion software as part of a general fortnightly database dump cron job that did all the XML stuff, then took the XML file, converted it to SQL, and compressed it? That way the problem is solved once, in one place, forever, for all users who want SQL format.
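To make the suggestion concrete, here is a rough sketch of what that extra cron step might look like. It is only an illustration under stated assumptions: the function name convert_dump and the converter_cmd parameter are hypothetical, and converter_cmd is assumed to be a command that reads the XML dump on stdin and writes SQL on stdout. The actual converter invocation is left out, since its options aren't given in this thread.

```python
import gzip
import os
import shutil
import subprocess
import threading


def _feed(src, dst):
    """Copy the decompressed XML into the converter's stdin, then close it."""
    try:
        shutil.copyfileobj(src, dst)
        dst.close()
    except BrokenPipeError:
        pass  # converter exited early; the return-code check reports it


def convert_dump(xml_gz_path, sql_gz_path, converter_cmd):
    """Stream a gzipped XML dump through a converter and gzip its SQL output.

    converter_cmd is a hypothetical argument list for a tool that reads XML
    on stdin and writes SQL on stdout.
    """
    tmp_path = sql_gz_path + ".tmp"
    proc = subprocess.Popen(
        converter_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    with gzip.open(xml_gz_path, "rb") as xml_in, \
            gzip.open(tmp_path, "wb") as sql_out:
        # Feed stdin from a separate thread so that reading stdout here
        # cannot deadlock on a full pipe.
        feeder = threading.Thread(target=_feed, args=(xml_in, proc.stdin))
        feeder.start()
        shutil.copyfileobj(proc.stdout, sql_out)
        feeder.join()
    proc.stdout.close()
    if proc.wait() != 0:
        os.remove(tmp_path)
        raise RuntimeError("converter exited with status %d" % proc.returncode)
    # Only publish the SQL file once it is complete.
    os.replace(tmp_path, sql_gz_path)
```

The dump job would call this once per XML file after the XML stage finishes, e.g. convert_dump("pages_public.xml.gz", "pages_public.sql.gz", cmd), where cmd is whatever command line runs the converter. Writing to a temporary name and renaming at the end means downloaders never see a half-written SQL file.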
and maybe a couple people might use some of them every once in
Au contraire - most people who want dumps will use them all the time!
Tell you what: If you don't believe me, try making one, uploading it, and then _next_ dump add a README that says "SQL dumps have been discontinued due to a lack of interest and demand from users. If you disagree, please address your comments to Brion on the wikitech-l mailing list (email: wikitech-l@wikimedia.org)". And then see what happens. :-)
Also, can we please have back the "is_redirect" field in the XML (and
Hmm, can probably do that yeah.
Sounds great, thank you!
All the best, Nick.