On Sun, 2007-11-18 at 11:20 -0500, Anthony wrote:
> Don't credit me with the idea though, I stole the idea from Thanassis Tsiodras (http://www.softlab.ntua.gr/~ttsiod/buildWikipediaOffline.html)
Looks interesting, though I'm skeptical of his assertion here:
"Orders of magnitude faster to install (a matter of hours) compared to loading the "dump" into MySQL"
I've posted my results here many times: the largest wiki out there (enwiki) takes ~40 minutes, _max_, to load into a clean, cold-booted MySQL instance from the XML source, using mysql and redirection (not using mwdumper in a pipe).
The target machine is a dual-core 2.4GHz AMD64 box with 2GB RAM and a single SATA drive. Not a powerhouse by any means. Give me a faster machine with a lot more RAM and faster disks, and I bet I could cut that down to less than 20 minutes.
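For anyone who hasn't done this, the two approaches being compared look roughly like the sketch below. The file name, database name, and user are placeholders, and the exact mwdumper invocation may vary with your version; either way, the empty MediaWiki schema (tables.sql) has to be loaded first.

    # Option 1: convert the XML dump to SQL with mwdumper, then load the
    # resulting file into MySQL via plain shell redirection:
    java -jar mwdumper.jar --format=sql:1.5 enwiki-pages-articles.xml.bz2 > enwiki.sql
    mysql -u wikiuser -p wikidb < enwiki.sql

    # Option 2: pipe mwdumper output straight into mysql instead:
    java -jar mwdumper.jar --format=sql:1.5 enwiki-pages-articles.xml.bz2 | mysql -u wikiuser -p wikidb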
So I think his logic is backwards. If it takes "a matter of hours" to install his version, that is SIGNIFICANTLY slower than using mwdumper and mysql directly, on the largest wiki dump available.
I can handle an awful lot of pages in a single thread using the API. I have no idea whether doing that would hurt the server, though.
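To be concrete about what "a single thread using the API" means, a naive fetch loop would look something like the sketch below. The titles file, output directory, and sleep interval are made up for illustration, and it assumes the titles are already URL-encoded; I'm not claiming this is how the offline tool works.

    # Fetch page wikitext one title at a time through api.php,
    # sleeping between requests so as not to hammer the servers.
    mkdir -p pages
    while read -r title; do
        curl -s "http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&titles=${title}" \
            > "pages/${title}.xml"
        sleep 1
    done < titles.txt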
Ideally, this should work _on_ the dump, not on the live server(s).