On Tue 16/02/10 14:13, "Jamie Morken" jmorken@shaw.ca wrote:
What is the benefit of the database dumps being archived/distributed in xml format instead of sql format? Converting the xml to sql takes a long time for big wikis and people seem to have problems with this step, so why isn't the sql format available for download instead of the xml format?
* You are DB neutral, so you do not need to have a version for mysql, for postgres...
* You may apply filters easily
* The XML is still useful after a DB schema upgrade
Emmanuel
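
As an illustration of the "apply filters easily" point, here is a minimal Python sketch that streams a MediaWiki-style XML export and keeps only the pages matching a predicate. The helper names (`local_name`, `iter_pages`, `filter_pages`) and the tiny `SAMPLE` document are made up for this example; a real dump carries a namespace, a siteinfo header and many more fields per page, and if a page has several revisions this sketch only sees the text of the last one.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text) for each <page> in a MediaWiki-style XML export.

    Pages are streamed with iterparse, so the dump is not loaded all at once.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the subtree we just processed


def filter_pages(source, keep):
    """Return the titles of pages for which keep(title, text) is true."""
    return [title for title, text in iter_pages(source) if keep(title, text)]


# Hypothetical hand-written sample, not taken from a real dump.
SAMPLE = b"""<mediawiki>
  <page><title>Foo</title><revision><text>hello</text></revision></page>
  <page><title>Talk:Foo</title><revision><text>chat</text></revision></page>
  <page><title>Bar</title><revision><text>world</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    # Keep everything except talk pages.
    print(filter_pages(io.BytesIO(SAMPLE), lambda t, _x: not t.startswith("Talk:")))
```

The same loop works on a file opened in binary mode, and the predicate can look at the page text as well as the title.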
- You are DB neutral, so you do not need to have a version for
mysql, for postgres...
The mysql can be converted to postgres and vice versa directly, so the xml
isn't necessary for this, see: http://www.mediawiki.org/wiki/Manual:PostgreSQL
- You may apply filters easily
- The XML is still useful after a DB schema upgrade
The xml can be exported from the sql database, I am sure, and I bet this tool works well, as Wikimedia uses it to create the xml dumps, right? So if people want to filter, they can use this well-maintained tool first. Also, if there is a DB schema upgrade, then the XML is only useful if the tool to convert from xml to sql has been updated, and in REALITY the tools apparently are not so good right now. This leads me to believe that sql would be a more useful format to share. Also, does the Wikimedia Foundation actually use the xml dumps internally for anything?
Ok, the simple question: how many people prefer XML or sql dumps?
cheers, Jamie
* Jamie Morken jmorken@shaw.ca [Tue, 16 Feb 2010 07:03:18 -0800]:
Ok, the simple question: how many people prefer XML or sql dumps?
I use XML dumps at my own much smaller wikis all the time. The advantages are already described by emmanuel@engelhart.org. I never tried to load huge wp-en dumps, though. I've heard that importDump.php is too slow or may even be troublesome (memory/resource leaks?) with these huge dumps. That can probably be fixed by dumping in chunks and then restarting the php process. I am suspicious of non-core tools like mwdumper, though I have never tried it (my dumps are not so huge). My voice also has low weight; I am not a core developer.
Dmitriy
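
The chunking idea could be sketched roughly as follows in Python: cut an XML export into pieces of at most N pages, so that each piece could be handed to a fresh import process. The function name `split_pages` is made up for this example, and it makes a loud simplification: the siteinfo header of a real dump is dropped and each chunk is wrapped in a bare `<mediawiki>` root, so whether importDump.php would accept such chunks unchanged is an assumption, not something tested here.

```python
import io
import xml.etree.ElementTree as ET


def split_pages(source, pages_per_chunk):
    """Yield XML strings, each holding at most pages_per_chunk <page> elements.

    Simplification (assumption): the <siteinfo> header of a real dump is not
    carried over; every chunk gets a bare <mediawiki> root element.
    """
    batch = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag.rsplit("}", 1)[-1] != "page":
            continue
        batch.append(ET.tostring(elem, encoding="unicode"))
        elem.clear()  # keep memory use flat while streaming
        if len(batch) == pages_per_chunk:
            yield "<mediawiki>" + "".join(batch) + "</mediawiki>"
            batch = []
    if batch:  # last, possibly shorter, chunk
        yield "<mediawiki>" + "".join(batch) + "</mediawiki>"


# Hypothetical hand-written sample, not taken from a real dump.
SAMPLE = b"""<mediawiki>
  <page><title>A</title></page>
  <page><title>B</title></page>
  <page><title>C</title></page>
</mediawiki>"""

if __name__ == "__main__":
    for number, chunk in enumerate(split_pages(io.BytesIO(SAMPLE), 2)):
        pages = ET.fromstring(chunk).findall("page")
        print(number, [p.findtext("title") for p in pages])
```

Each chunk could then be written to its own file and imported by a separate run, so a leak in one process does not accumulate across the whole dump.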
On 2/16/10 7:03 AM, Jamie Morken wrote:
Ok, the simple question: how many people prefer XML or sql dumps?
I think we have a FAQ on this...
http://meta.wikimedia.org/wiki/Download#What_happened_to_the_SQL_dumps.3F
You *do* realize that such "SQL dumps" would have to be invented from whole cloth and couldn't just be dumped from the actual databases, right?
The raw databases include dozens of alternate clusters and have data from different revisions compressed together, including deleted items and private data, and can't simply be released by WMF even if someone actually wanted to figure out how to replicate Wikimedia's exact storage cluster layout to do a data import.
Most likely if they were created they'd simply be created by running the xml through a tool like mwdumper...
-- brion
wikitech-l@lists.wikimedia.org