You can "SET unique_checks=0;", which speeds up index handling; if that
doesn't help enough, then you may indeed want to look at creating the
indices afterwards. Be aware, however, that this can be very costly in
time at the end.
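For example, the session setup might look like the following sketch (run from the mysql client; unique_checks and foreign_key_checks are standard session variables, and the filename is the one from the thread):

```sql
-- Sketch: disable the expensive per-row checks for this session,
-- source the dump, then turn the checks back on.
SET unique_checks=0;
SET foreign_key_checks=0;
SOURCE enwiki-20160407-page.sql;
SET foreign_key_checks=1;
SET unique_checks=1;
```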
You could filter out the indices on the fly (tossing all the lines
with KEY in them from the CREATE stanza). Alternatively you could pull the
drop/create stanzas into a separate file, run those, drop the indices
yourself, and then gzcat | tail the pages dump from the line after the
CREATE stanza.
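A minimal sketch of the on-the-fly filtering, assuming the dump follows the usual mysqldump layout (two-space-indented KEY / UNIQUE KEY lines inside the CREATE stanza); note that dropping those lines leaves a dangling comma on the last kept column line, which the awk below also fixes:

```shell
# Sketch, not tested against a full dump: stream the dump, skip the
# secondary-index definitions, and strip the now-dangling comma on the
# line just before the closing ") ENGINE=..." of the CREATE stanza.
zcat enwiki-20160407-page.sql.gz | awk '
  /^  (UNIQUE )?KEY / { next }             # drop secondary index lines
  {
    if (have) {
      if ($0 ~ /^\)/) sub(/,$/, "", prev)  # fix trailing comma before ")"
      print prev
    }
    prev = $0; have = 1
  }
  END { if (have) print prev }
' > page-noindex.sql
```

PRIMARY KEY is kept, since the pattern only matches lines beginning with KEY or UNIQUE KEY after the indentation.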
If you would like, you could add a narrative about your experience
importing these files, to our docs about importing dumps, here:
https://meta.wikimedia.org/wiki/Data_dumps/Import_examples
Ariel Glenn
On Mon, Apr 18, 2016 at 10:27 AM, Phyambre <phy.reg(a)gmail.com> wrote:
Dear all,
I recently tried to import one of the latest SQL dumps of English
Wikipedia titles in MySQL:
- 2016-04-11 20:07:38 done Base per-page data (id, title, old
restrictions, etc).
- enwiki-20160407-page.sql.gz
<https://dumps.wikimedia.org/enwiki/20160407/enwiki-20160407-page.sql.gz>
1.3 GB
I think there is a performance issue here that could be largely improved.
Although I disabled the foreign key checks, enabled a large packet size,
and did some other optimizations, it took around 2 days on a laptop with
8 GB of RAM.
I don't think I can disable autocommit due to the lack of RAM, but even
without autocommit, I tried to modify some columns and it took ages.
I did a test: I edited the SQL dump file and removed the index creation
from the CREATE TABLE statement. This was not easy, because very few
text editors can open such a large file.
I ran the import again, without indices, key constraints, etc., and it
took less than two hours.
Creating the indices / key constraints after importing the tables took
around 10-15 minutes per index.
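For reference, rebuilding the page table's secondary indices afterwards might look like the sketch below (index names and columns are assumed from MediaWiki's tables.sql of that era; verify them against the CREATE stanza in your own dump):

```sql
-- Sketch: one ALTER builds all secondary indices together, rather than
-- one table scan per index. Definitions assumed from MediaWiki's schema.
ALTER TABLE page
  ADD UNIQUE INDEX name_title (page_namespace, page_title),
  ADD INDEX page_random (page_random),
  ADD INDEX page_len (page_len);
```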
I think there is a problem with the original dump: all the indices are
being rebuilt after every insertion, which seems very inefficient.
So I would like to ask whether the dump files could be created in a more
optimized way, skipping the index creation at first and leaving it for
the end of the script.
Best regards,
Rafael
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l