[Mediawiki-l] Extremely slow import of pagelinks

Benjamin Lees emufarmers at gmail.com
Mon Jun 22 22:54:20 UTC 2009


On Mon, Jun 22, 2009 at 3:27 AM, Rolf Lampa [RIL] <rolf.lampa at rilnet.com>wrote:

> Hi all,
>
> I experience extremely slow import of pagelinks.sql (MYISAM). It's only
> about 13 000 000 rows of data so it shouldn't take many minutes. Now it
> takes hours.
>
> I noticed that time for inserts increase as the table grows. Peeking
> with SHOW PROGRESSLIST sometimes says:
>
> +------+------------+--------
> | time | State      | Info
> | 38   | ** DEAD ** | INSERT INTO 'pagelinks' ...
> +------+------------+--------------------------------
>
> I thought it'd be enough to shut off keys during import etc like so:
>
> SET AUTOCOMMIT=0;
> SET UNIQUE_CHECKS=0;
> SET FOREIGN_KEY_CHECKS=0;
> ALTER TABLE `pagelinks` DISABLE KEYS;
>
> ...and after import activate them again like so:
>
> SET UNIQUE_CHECKS=1;
> SET FOREIGN_KEY_CHECKS=1;
> ALTER TABLE `pagelinks` ENABLE KEYS;
> COMMIT;
>
> Since it took so long I aborted the import of tables (tried first to
> import in alphabethical order) and imported all the other tables first,
> and after that I import pagelinks.
>
> I have a dual core with 2GB ram, xampplite on windows.
>
> What am I doing wrong? Does import order really matter that much? (I'da
> already tried all tricks if it only didn't take soooo loooooong to try
> all 'em tricks when its slow like this).
>
> Any ideas about how to reduce time /significantly/ from hours down to
> minutes? I use this data for testing ("consuming" it by processing it)
> and thus I need to often "reinstall" the data, in several databases.
>
> // Rolf Lampa
>
Kind of sidestepping your question, but I think you can just rebuild the
pagelinks table with refreshLinks.php.  I can't say for sure whether that
would actually be faster, though.


More information about the MediaWiki-l mailing list