[MediaWiki-l] Import Performance Issue/Question

LCJones lcjones at jonesipedia.com
Thu Dec 31 23:19:45 UTC 2015


On 12/31/2015 1:23 PM, Daniel Barrett wrote:
> Chap Jones writes:
>> It takes an excruciatingly long time to import even a relatively small wikipedia page into my wiki.
> Are you importing the page using Special:Import, or are you running the script maintenance/importDump.php?
>
> If you're using Special:Import, are things any faster if you use importDump.php?
>
> DanB
>
DanB,

Thanks for pointing out the importDump.php script. Unfortunately, it
takes just as long as Special:Import. I ran top and tcpdump in separate
terminals during a small import (484 KB of XML), hoping to spot some
oddity, but no joy. However, Apache eats a lot of CPU during the import
process, and I did notice resets in tcpdump, especially from
upload-lb.eqiad.wikimedia.org.

17:48:26.945739 IP upload-lb.eqiad.wikimedia.org.https >
farmsrv.scrubbed.net.42515: Flags [F.], seq 4437, ack 673, win 61,
options [nop,nop,TS val 528874286 ecr 481172104], length 0
17:48:26.945871 IP farmsrv.scrubbed.net.42515 >
upload-lb.eqiad.wikimedia.org.https: Flags [P.], seq 673:704, ack 4438,
win 319, options [nop,nop,TS val 481172115 ecr 528874286], length 31
17:48:26.945922 IP farmsrv.scrubbed.net.42515 >
upload-lb.eqiad.wikimedia.org.https: Flags [R.], seq 704, ack 4438, win
319, options [nop,nop,TS val 481172116 ecr 528874286], length 0
17:48:26.990295 IP upload-lb.eqiad.wikimedia.org.https >
farmsrv.scrubbed.net.42515: Flags [R], seq 2904368554, win 0, length 0
17:48:27.162347 IP farmsrv.scrubbed.net.42517 >
upload-lb.eqiad.wikimedia.org.https: Flags [S], seq 1673842725, win
29200, options [mss 1460,sackOK,TS val 481172170 ecr 0,nop,wscale 7],
length 0

I'm pretty sure the resets are normal SYN/ACK/FIN cleanup.
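
One rough way to check how common the resets are is to tally the TCP flag combinations in a saved capture. The following is only a sketch in Python: the function names are made up for illustration, and it assumes the capture is tcpdump's default text output, possibly re-wrapped by a mail client as in the excerpt above.

```python
import re
from collections import Counter

# A tcpdump record starts with a timestamp such as 17:48:26.945739.
RECORD_START = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+ ")
FLAGS = re.compile(r"Flags \[([A-Za-z.]+)\]")


def join_records(lines):
    """Re-join records that were wrapped across several lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def count_flags(lines):
    """Count how often each TCP flag combination (e.g. 'R.', 'S') appears."""
    counts = Counter()
    for record in join_records(lines):
        match = FLAGS.search(record)
        if match:
            counts[match.group(1)] += 1
    return counts


# The five packets quoted in this message, wrapped exactly as posted.
sample = """\
17:48:26.945739 IP upload-lb.eqiad.wikimedia.org.https >
farmsrv.scrubbed.net.42515: Flags [F.], seq 4437, ack 673, win 61,
options [nop,nop,TS val 528874286 ecr 481172104], length 0
17:48:26.945871 IP farmsrv.scrubbed.net.42515 >
upload-lb.eqiad.wikimedia.org.https: Flags [P.], seq 673:704, ack 4438,
win 319, options [nop,nop,TS val 481172115 ecr 528874286], length 31
17:48:26.945922 IP farmsrv.scrubbed.net.42515 >
upload-lb.eqiad.wikimedia.org.https: Flags [R.], seq 704, ack 4438, win
319, options [nop,nop,TS val 481172116 ecr 528874286], length 0
17:48:26.990295 IP upload-lb.eqiad.wikimedia.org.https >
farmsrv.scrubbed.net.42515: Flags [R], seq 2904368554, win 0, length 0
17:48:27.162347 IP farmsrv.scrubbed.net.42517 >
upload-lb.eqiad.wikimedia.org.https: Flags [S], seq 1673842725, win
29200, options [mss 1460,sackOK,TS val 481172170 ecr 0,nop,wscale 7],
length 0
"""

for flags, n in sorted(count_flags(sample.splitlines()).items()):
    print(f"{flags:3} {n}")
```

Run against a full capture, a large share of R or R. entries compared with S entries would suggest connections are being torn down abnormally rather than closed cleanly; a handful, as here, says little either way.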

I just did a single 1.2 MB Special:Import and it took 9 minutes.

In contrast, using wget (574 MB file):
Saving to:
‘ubuntu-14.04.3-server-amd64.iso?_ga=1.156029886.425328332.1451603483’

100%[==========>] 601,882,624 4.79MB/s   in 94s
2015-12-31 18:14:31 (6.11 MB/s) -
‘ubuntu-14.04.3-server-amd64.iso?_ga=1.156029886.425328332.1451603483’
saved [601882624/601882624]
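
To put the two figures side by side, here is a back-of-the-envelope calculation in Python. It is only a sketch: it assumes the import quoted above was about 1.2 MiB and that the 9 minutes is wall-clock time, and it compares a raw download with an import that also parses and stores pages, so it only shows that raw bandwidth is unlikely to be the bottleneck.

```python
MIB = 1024 * 1024


def throughput(num_bytes, seconds):
    """Average rate in bytes per second."""
    return num_bytes / seconds


# Figures from the wget output: 601,882,624 bytes in 94 s.
wget_rate = throughput(601_882_624, 94)

# Assumption: the import was roughly 1.2 MiB and took 9 minutes.
import_rate = throughput(1.2 * MIB, 9 * 60)

print(f"wget:   {wget_rate / MIB:.2f} MiB/s")
print(f"import: {import_rate / 1024:.2f} KiB/s")
print(f"ratio:  about {wget_rate / import_rate:,.0f}x")
```

The download rate works out to the same 6.11 MB/s that wget reports, while the import averages only a couple of kilobytes per second, a gap of more than three orders of magnitude.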

Onward I trudge ..   ;)

Chap


