dan nessett wrote:
I have imported a daily dump of 121,000 pages from a wiki running MediaWiki 1.13.2. The importDump run took 80 hours, and when I ran rebuildAll afterwards, the resulting job queue held about 250,000 jobs, which took 10.5 hours to process. After that, I downloaded a subsequent daily dump (from six days after the original) and imported it; that took about 30 minutes.
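For reference, the invocations were roughly as follows (the dump filename here is a placeholder, not the real one):

  php maintenance/importDump.php < daily-dump.xml   # import pages from the XML dump on stdin
  php maintenance/rebuildall.php                    # rebuild link tables, text index, recent changes
  php maintenance/runJobs.php                       # work through the resulting job queue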
Now I want to update the internal data structures so my personal copy of the wiki is complete. I decided against running rebuildAll again, since the daily dump contains no images. Also, the wiki is Postgres-based, so rebuildtextindex doesn't work (it is MySQL-only). So I thought I would use refreshLinks. However, it is running almost as slowly as the original importDump: after processing 2,200 pages and extrapolating forward, I estimate it will take about 40 hours to complete.
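If I read the script right, refreshLinks takes an optional starting page_id, so a long run can at least be resumed; a sketch (the 2200 is just where my run currently is, and assumes page_ids roughly track pages processed):

  php maintenance/refreshLinks.php          # refresh link tables from the beginning
  php maintenance/refreshLinks.php 2200     # resume from page_id 2200 after an interruption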
I have several questions:
1. Are there any settings that would make importDump and refreshLinks run more quickly?
2. Is it necessary to run refreshLinks after an incremental importDump?
3. Are there other maintenance scripts I should run?
Thanks,
Dan Nessett
importDump should already do everything rebuildAll does, so you shouldn't need to run rebuildAll again.
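So after an incremental import, letting the job queue drain should be enough; a minimal sketch, with the incremental dump filename assumed:

  php maintenance/importDump.php < incremental-dump.xml   # import only the changed pages
  php maintenance/runJobs.php                             # process any deferred link updates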