Tim Starling wrote:
Brion Vibber wrote:
Christoph Litauer wrote:
Thanks, but I already figured mwdumper out: "Future versions of mwdumper will include support for creating a database and configuring a MediaWiki installation directly, but currently it just produces raw SQL which can be piped to MySQL."
Yes, you have to run tables.sql into your database as well. Painful, I know. ;)
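For reference, the manual steps look roughly like this (only a sketch: the database name "wikidb", the dump filename and the MySQL credentials are placeholders to replace with your own):

  # Create the database and load the MediaWiki schema first
  mysqladmin -u root -p create wikidb
  mysql -u root -p wikidb < maintenance/tables.sql

  # Then pipe mwdumper's SQL output straight into MySQL
  java -jar mwdumper.jar --format=sql:1.5 pages_articles.xml.bz2 \
      | mysql -u root -p wikidb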
I already produced raw SQL (using mwimport), so the XML-to-SQL conversion isn't the bottleneck. As far as I can tell, mwdumper only improves that step, not the import of the data into the database.
I don't know anything about this "mwimport" tool, but mwdumper uses batch inserts and the README includes a number of tips about speeding up the SQL end. You might want to check if you're doing this already.
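Without quoting the README, the server-side settings that usually matter for a bulk import are along these lines (a my.cnf sketch; the values are purely illustrative, not recommendations for any particular machine):

  [mysqld]
  # Give InnoDB a generous share of RAM while importing
  innodb_buffer_pool_size        = 512M
  # Trade durability for speed during the bulk load only
  innodb_flush_log_at_trx_commit = 0
  # Batched inserts can produce very large statements
  max_allowed_packet             = 32M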
He probably means this:
http://meta.wikimedia.org/wiki/Data_dumps/mwimport
It claims to be faster than mwdumper due to lower CPU usage during XML parsing. I suspect you could get the same speedup by putting "bfr" in the pipeline, since I very much doubt you'd max out the CPU while piping into MySQL if the whole thing were properly multithreaded.
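That pipeline would look something like this (a sketch only; the database and file names are placeholders, and bfr is shown with its default buffer settings):

  # bfr buffers between the XML parser and MySQL so that neither end
  # has to sit idle waiting for the other
  java -jar mwdumper.jar --format=sql:1.5 pages_articles.xml.bz2 \
      | bfr \
      | mysql -u root -p wikidb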
The problem in Christoph Litauer's case is most probably insufficient memory and disk resources, possibly coupled with a poorly tuned MySQL server. Fixing that is probably a better topic for a MySQL support query than for a wikitech-l mailing list thread.
I totally agree! I was hoping for replies like "same for me" or "things run about 20 times faster here", though admittedly I didn't ask for that. I couldn't find any hints about how fast these imports "normally" run, and therefore whether it's worth spending time optimizing my MySQL server. It seems it is, so I will take a look at that. Thank you all for the answers and hints.