Debian Woody is my mortal enemy. I thought it was dead finally, but no...
Ah, you see, to a distro-laggard like our company, Woody is still a very viable option. In fact it's considered quite safe and stable, whereas Sarge is considered a little bit too new and risqué to be trusted yet with mission-critical stuff. That mentality will probably only change when the updates for Woody stop (which I believe is scheduled to happen on 1 May 2006, unless Etch is released before that date, which honestly seems rather unlikely). So only after the 1st of May 2006 will Woody finally really be dead :-)
On the data import front, today I've tried installing MediaWiki 1.5RC4, and importing via importDump, but I didn't have much luck:

======================================================
# wget http://download.wikimedia.org/wikipedia/en/20050909_pages_public.xml.gz
# gzip --test ~nickj/wikipedia/20050909_pages_public.xml.gz
// no output or error, so presumably the gzip file is not corrupt
# md5sum ~nickj/wikipedia/20050909_pages_public.xml.gz
1de5093f1dd6c5afd4ed080474456d54 /home/nickj/wikipedia/20050909_pages_public.xml.gz
// this matches the sum in http://download.wikimedia.org/wikipedia/en/20050909-md5sums
# gzip -dc ~nickj/wikipedia/20050909_pages_public.xml.gz | php maintenance/importDump.php
100 (58.59332294078 pages/sec 58.59332294078 revs/sec)
200 (54.030806663729 pages/sec 54.030806663729 revs/sec)
300 (51.490838593282 pages/sec 51.490838593282 revs/sec)
400 (50.320459245887 pages/sec 50.320459245887 revs/sec)
500 (49.26519486778 pages/sec 49.26519486778 revs/sec)
600 (48.360482472114 pages/sec 48.360482472114 revs/sec)
700 (48.871787388943 pages/sec 48.871787388943 revs/sec)
800 (49.154366513935 pages/sec 49.154366513935 revs/sec)
900 (49.266177573961 pages/sec 49.266177573961 revs/sec)
1000 (49.018343826441 pages/sec 49.018343826441 revs/sec)
1100 (49.167214599006 pages/sec 49.167214599006 revs/sec)
1200 (49.605583964957 pages/sec 49.605583964957 revs/sec)
1300 (49.425412530694 pages/sec 49.425412530694 revs/sec)
1400 (49.357795012659 pages/sec 49.357795012659 revs/sec)
1500 (49.401458453695 pages/sec 49.401458453695 revs/sec)
1600 (49.248578795592 pages/sec 49.248578795592 revs/sec)
1700 (49.205397806241 pages/sec 49.205397806241 revs/sec)
1800 (49.139689484041 pages/sec 49.139689484041 revs/sec)
1900 (49.369342847918 pages/sec 49.369342847918 revs/sec)
2000 (49.706945229133 pages/sec 49.706945229133 revs/sec)
2100 (49.860871622316 pages/sec 49.860871622316 revs/sec)
2200 (49.935237390351 pages/sec 49.935237390351 revs/sec)
2300 (49.976472942288 pages/sec 49.976472942288 revs/sec)
2400 (49.965834881883 pages/sec 49.965834881883 revs/sec)
2500 (50.095592279086 pages/sec 50.095592279086 revs/sec)
2600 (49.913163596511 pages/sec 49.913163596511 revs/sec)
2700 (50.346513647263 pages/sec 50.346513647263 revs/sec)
2800 (50.554639314109 pages/sec 50.554639314109 revs/sec)
2900 (50.30025952798 pages/sec 50.30025952798 revs/sec)
3000 (50.235683978552 pages/sec 50.235683978552 revs/sec)
3100 (49.935743124336 pages/sec 49.935743124336 revs/sec)
3200 (49.898859597319 pages/sec 49.898859597319 revs/sec)
Content-type: text/html

# // (i.e. spontaneously aborts after ~3200 pages and ~60 seconds).
======================================================
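For anyone who wants to repeat the two sanity checks on the dump without the command-line tools, here is a rough Python sketch of them. It is only an approximation of what `gzip --test` and `md5sum` do, not a replacement, and the function names `md5_of_file` and `gzip_is_intact` are ones I made up for illustration.

```python
import gzip
import hashlib
import zlib


def md5_of_file(path, chunk_size=1 << 20):
    """Hex MD5 of the file's raw (still compressed) bytes, like md5sum prints."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so a multi-gigabyte dump never sits in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole stream and throw it away, roughly like gzip --test."""
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # Bad header or CRC, truncated stream, or corrupt compressed data.
        return False
    return True
```

With the dump from the transcript, `md5_of_file(path)` would be compared against the published sum `1de5093f1dd6c5afd4ed080474456d54`, and `gzip_is_intact(path)` should return `True`. Both checks only say the download is fine; they say nothing about why importDump stops.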
I'm beginning to suspect that some kind of higher being is determined that under no circumstances will I be able to load this data into a database ;-)
All the best, Nick.