Hi,
for a few weeks now I have been trying to set up a local Wikipedia server. For research purposes we need to import the German Wikipedia with its complete revision history. I downloaded the compressed XML dump and used mwimport to convert the XML data into a file of SQL commands. The resulting file is about 500 MB and contains roughly 50 million lines of SQL statements.
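Roughly, the conversion step looks like this (the dump filename and the decompression tool are only illustrative, not the exact ones I used):

# decompress the history dump and convert it to SQL with mwimport
bzcat dewiki-pages-meta-history.xml.bz2 | ./mwimport > dewiki.sql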
I then piped this file into the mysql client. The MySQL server runs on the same machine, which has 2 GB of memory, and I configured InnoDB to use 1.5 GB as its buffer pool (innodb_buffer_pool_size).
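The relevant part of my.cnf looks roughly like this (other settings omitted; 1536M is the 1.5 GB mentioned above):

[mysqld]
# give InnoDB 1.5 GB of the machine's 2 GB as buffer pool
innodb_buffer_pool_size = 1536M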
The import has now been running for two weeks and has loaded about 20 million of the 29 million article revisions. That seems extremely slow to me; I must be doing something wrong, but I cannot find the mistake.
Even a simple count query takes between 3 and 7 hours:
mysql> select count(rev_id) from revision;
+---------------+
| count(rev_id) |
+---------------+
|      20923026 |
+---------------+
1 row in set (7 hours 5 min 53.40 sec)
How can that be? Any ideas on how to improve the performance? Thanks a lot in advance!