Hi Everybody,
I used mwdumper to import the latest pages-meta-current XML dump (enwiki-20101011-pages-meta-current.xml.bz2) into my MediaWiki installation. Everything seemed fine; however, I found only 6,669,091 pages in the database, even though mwdumper stopped and exited at a count of 21,894,705.
I am not sure whether I have successfully imported all the current pages into MediaWiki. Is there any way to verify that? Is there any per-dump page-count data I could cross-reference against? And is there a way to track what errors were encountered, other than reading through the huge log file?
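As a first sanity check, I assume I could compare raw row counts against the numbers mwdumper reported; a minimal sketch, assuming the default (empty) table prefix on the wiki database:

    SELECT COUNT(*) FROM page;      -- number of page rows actually imported
    SELECT COUNT(*) FROM revision;  -- should be one revision per page for a pages-meta-current dump
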
On the other hand, I found that parsing throughput drops over time during the import: it fell from about 345.12 pages/sec to 79.125 pages/sec. Is that a normal phenomenon? Is there any way to boost this performance? The strange part is that the figure rose again to around 200/sec after the 6-million-something page mark (maybe because nothing was being inserted into the DB anymore).
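One thing I am considering trying (just a guess on my part, assuming the tables are InnoDB) is relaxing MySQL's durability settings for the duration of the import and reverting them afterwards:

    SET GLOBAL innodb_flush_log_at_trx_commit = 2;  -- flush the InnoDB log roughly once per second instead of on every commit
    SET GLOBAL sync_binlog = 0;                     -- let the OS decide when to sync the binary log, if binary logging is enabled
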
Any thoughts would be appreciated. Thank you.