Folks,
Performance Issue/Question
MediaWiki: 1.26.2 :: Apache: 2.4.16 (Ubuntu) :: PHP: 5.6.16-2+deb.sury.org~trusty+1 :: MySQL: 5.6.27-0ubuntu0.14.04.1-log :: Processor: Intel Xeon QuadCore X3210 :: Memory: 3 GB :: HD: 1.5 TB (126 GB used) :: 64-bit system
I have a very small, single-user (just me) wiki that I use for genealogy work, and very little traffic visits the site. I keep the wiki updated fairly religiously, including extensions as well as the physical server itself, and I have Zend OPcache running on the server as well.
My issue has always been the speed of importing XML files exported from Wikipedia. As an example, I may export a Wikipedia page about a particular town or county that is relevant to my genealogy research and import the XML into my wiki. It takes an excruciatingly long time to import even a relatively small Wikipedia page: 5+ minutes for a single page such as https://en.wikipedia.org/wiki/Guilford_County,_North_Carolina.
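(For reference, I grab the XML via Special:Export, something along these lines, and then feed that file to the import:

  # fetch the page as import-ready XML
  wget "https://en.wikipedia.org/wiki/Special:Export/Guilford_County,_North_Carolina" -O guilford.xml
)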
I've been trying to fix this issue for a couple of years and have not figured out why the Wikipedia pages take so long to import.
I'd like to mention here that my physical server sits in a full-blown data center, with 100 Mb switch ports on a gigabit fiber network and a 10 Gb pipe out to the world. Network access is not the problem.
I guess my question is, what can I do to improve the import process?
Thanks,
Chap
wiki.jonesipedia.com
Chap Jones writes:
It takes an excruciatingly long time to import even a relatively small wikipedia page into my wiki.
Are you importing the page using Special:Import, or are you running the script maintenance/importDump.php?
If you're using Special:Import, are things any faster if you use importDump.php?
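For example, something like this from your wiki root (dump.xml is just a placeholder for your exported file; the --no-updates flag, if your version has it, defers the link-table updates that often dominate import time):

  # import the dump without rebuilding link tables on every page
  php maintenance/importDump.php --no-updates dump.xml
  # then rebuild the deferred data afterwards
  php maintenance/rebuildrecentchanges.php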
DanB
On 12/31/2015 1:23 PM, Daniel Barrett wrote:
Are you importing the page using Special:Import, or are you running the script maintenance/importDump.php? If you're using Special:Import, are things any faster if you use importDump.php?
DanB,
Thanks for pointing out the importDump script. Indeed, importDump.php takes just as long as Special:Import. I was running top and tcpdump in separate terminals during a small import (484 KB of XML), hoping to spot some oddity, but no joy. However, Apache eats a lot of CPU during the import process, and I did notice resets in tcpdump, especially from upload-lb.eqiad.wikimedia.org.
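For reference, the capture was roughly this (interface name is from my box):

  # watch traffic to/from the Wikimedia upload cluster during the import
  tcpdump -i eth0 host upload-lb.eqiad.wikimedia.org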
17:48:26.945739 IP upload-lb.eqiad.wikimedia.org.https > farmsrv.scrubbed.net.42515: Flags [F.], seq 4437, ack 673, win 61, options [nop,nop,TS val 528874286 ecr 481172104], length 0
17:48:26.945871 IP farmsrv.scrubbed.net.42515 > upload-lb.eqiad.wikimedia.org.https: Flags [P.], seq 673:704, ack 4438, win 319, options [nop,nop,TS val 481172115 ecr 528874286], length 31
17:48:26.945922 IP farmsrv.scrubbed.net.42515 > upload-lb.eqiad.wikimedia.org.https: Flags [R.], seq 704, ack 4438, win 319, options [nop,nop,TS val 481172116 ecr 528874286], length 0
17:48:26.990295 IP upload-lb.eqiad.wikimedia.org.https > farmsrv.scrubbed.net.42515: Flags [R], seq 2904368554, win 0, length 0
17:48:27.162347 IP farmsrv.scrubbed.net.42517 > upload-lb.eqiad.wikimedia.org.https: Flags [S], seq 1673842725, win 29200, options [mss 1460,sackOK,TS val 481172170 ecr 0,nop,wscale 7], length 0
I'm pretty sure the resets are normal SYN/ACK/FIN connection cleanup.
I just did a single 1.2 MB Special:Import and it took 9 minutes.
In contrast, wget pulled a 574 MB file in 94 seconds:

Saving to: ‘ubuntu-14.04.3-server-amd64.iso?_ga=1.156029886.425328332.1451603483’
100%[==========>] 601,882,624  4.79MB/s  in 94s
2015-12-31 18:14:31 (6.11 MB/s) - ‘ubuntu-14.04.3-server-amd64.iso?_ga=1.156029886.425328332.1451603483’ saved [601882624/601882624]
Onward I trudge .. ;)
Chap
On 2015-12-31, LCJones lcjones@jonesipedia.com wrote:
Thanks for pointing out the importDump script. Indeed, importDump.php takes just as long as Special:Import. [...] However, Apache eats a lot of CPU during the import process, and I did notice resets in tcpdump, especially from upload-lb.eqiad.wikimedia.org.
How large are your XML dumps? The process is known to be pretty slow. What happens if you take the network out of the equation, i.e. import just a local file?
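For example, save the exported XML to disk first and point the maintenance script at the local file (the path here is just a placeholder):

  # no network involved once the file is on disk
  php maintenance/importDump.php /tmp/page.xml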
Saper