Dear all,
I recently tried to import one of the latest SQL dumps of English
Wikipedia titles into MySQL:
- 2016-04-11 20:07:38 done: Base per-page data (id, title, old
restrictions, etc).
I think there is a performance issue here, and the import could be
sped up considerably.
Although I disabled foreign key checks, increased the maximum allowed
packet size, and applied some other optimizations, the import took
around two days on a laptop with 8 GB of RAM.
I don't think I can disable autocommit, due to the limited RAM, but
even with autocommit turned off, modifying some columns took ages.
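
For reference, the kind of session setup I mean looks roughly like
this (a sketch; the packet size, database name and dump file name are
only examples):

    $ mysql --max_allowed_packet=1G wikidb

    SET foreign_key_checks = 0;  -- skip FK validation during the load
    SET unique_checks = 0;       -- skip uniqueness checks on secondary keys
    SET autocommit = 0;          -- optional: one big transaction (needs memory)
    SOURCE enwiki-20160407-page.sql;
    COMMIT;
    SET unique_checks = 1;
    SET foreign_key_checks = 1;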
I ran a test: I edited the SQL dump file and removed the index
definitions from the CREATE TABLE statements. This was not easy,
because very few text editors can open such a large file.
Then I ran the import again, without indices, key constraints, etc.,
and it took less than two hours.
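
For the page table, that meant turning the CREATE TABLE statement
into something like this, keeping only the PRIMARY KEY (simplified,
the real statement has more columns):

    CREATE TABLE `page` (
      `page_id` int unsigned NOT NULL AUTO_INCREMENT,
      `page_namespace` int NOT NULL,
      `page_title` varbinary(255) NOT NULL,
      -- ... remaining columns ...
      PRIMARY KEY (`page_id`)
      -- removed: UNIQUE KEY `name_title` (`page_namespace`,`page_title`)
      -- removed: KEY `page_random` (`page_random`), etc.
    ) ENGINE=InnoDB;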
Creating the indices and key constraints after importing the data
took around 10-15 minutes per index.
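
One detail that helps here: MySQL accepts several ADD KEY clauses in
a single ALTER TABLE, which should build all the indexes in one pass
over the data instead of one pass per index, e.g.:

    ALTER TABLE `page`
      ADD UNIQUE KEY `name_title` (`page_namespace`,`page_title`),
      ADD KEY `page_random` (`page_random`),
      ADD KEY `page_len` (`page_len`);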
I think there is a problem with the original dump: every secondary
index has to be updated on every single insertion, which is very
inefficient.
So I would like to ask whether the dump files could be generated in a
more optimized way, skipping the index definitions in the CREATE
TABLE statements and creating the indexes at the end of the script
instead.
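
That is, the generated script would follow the usual bulk-load
pattern, roughly:

    -- 1. table definitions with only the PRIMARY KEY
    CREATE TABLE `page` (..., PRIMARY KEY (`page_id`)) ENGINE=InnoDB;
    -- 2. all the INSERT statements
    INSERT INTO `page` VALUES (...),(...);
    -- 3. secondary indexes, created once at the end
    ALTER TABLE `page`
      ADD UNIQUE KEY `name_title` (`page_namespace`,`page_title`),
      ADD KEY `page_random` (`page_random`);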
Best regards,
Rafael