Dear Ariel,

I have begun to make use of the Incremental XML Data Dumps, and have a few questions.

0) Acronyms

For brevity, I shall coin two terms:

xdump - XML Data Dump
xincr - Incremental XML Data Dump

1) Checksums

The checksum files for the `xincr's are not formatted correctly, causing `md5sum' to reject them. The correct format is:

<checksum><two spaces><filename><newline>

(shell)$ cat simplewiki-20140703-md5sums.txt
d03f3a91ef0273eb814f39a1d13788cb
c51f2bd5ef6bd42ce65cf4a7fca72400

(shell)$ md5sum --check simplewiki-20140703-md5sums.txt
md5sum: simplewiki-20140703-md5sums.txt: no properly formatted MD5 checksum lines found
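In the meantime, I am working around this by rebuilding the file myself. A minimal sketch, assuming the bare hashes are listed in the same order as the dump files they cover (the filenames below are examples I made up; substitute the real ones):

```shell
# The hashes as downloaded (bare, one per line):
printf '%s\n' \
  d03f3a91ef0273eb814f39a1d13788cb \
  c51f2bd5ef6bd42ce65cf4a7fca72400 \
  > md5sums.txt

# The filenames I assume they correspond to, in the same order
# (example names only -- substitute the actual dump filenames):
printf '%s\n' \
  simplewiki-20140703-pages-meta-hist-incr.xml.bz2 \
  simplewiki-20140703-maxrevid.txt \
  > filenames.txt

# Join each hash to its filename with exactly two spaces,
# which is the format `md5sum --check' expects:
paste md5sums.txt filenames.txt \
  | awk -v OFS='  ' '{ print $1, $2 }' \
  > md5sums-fixed.txt

# md5sum --check md5sums-fixed.txt   # now parses (dump files must exist locally)
```

This is fragile, of course, since it depends on my guessing the hash-to-file correspondence; having the filenames in the published checksum file would remove the guesswork.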

2) maintenance/importDump.php

Since no incremental SQL files are provided, I cannot use `mwxml2sql' and must instead use `importDump.php'. However, I have encountered a few issues when running `importDump.php' on `xincr's.

2.1) Speed: Importation proceeds at less than 0.1 pages/sec. At that rate, for the largest wikis (commonswiki, enwiki, wikidatawiki), importation cannot complete before the `xincr' for the next day is posted.
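For reference, this is roughly how I invoke it; I have been experimenting with `--no-updates' to skip the secondary link-table updates during import (deferring them to the rebuild scripts afterwards), on the assumption that those updates dominate the per-page cost:

```shell
# Sketch of my import invocation (paths are examples from my setup).
# --no-updates skips link/secondary-table updates per page; they must
# then be regenerated afterwards, e.g. with rebuildall.php.
bzcat simplewiki-20140703-pages-meta-hist-incr.xml.bz2 \
  | php maintenance/importDump.php --no-updates --report=100

php maintenance/rebuildrecentchanges.php
```

Even so, the throughput remains far below what a daily cycle requires, hence my question about alternative tools below.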

2.2) Pauses: Normally, when running `top', I can see at least one CPU near 100% for `php' and `mysql'. However, sometimes importation pauses for several minutes, with no apparent CPU or disk activity. I assume that there is a time-out somewhere that allows importation to proceed again. Any comments on this phenomenon would be most welcome.

2.3) Failures: Sometimes importation fails outright. I see this often with the `xincr's from `betawikiversity'. I have not yet isolated the specific records that cause the failures. But it raises the question: Is `importDump.php' still supported?

3) Tools

Can you please advise as to the best method for importing `xincr's?
Is there another importation tool that you would recommend (one that is both supported and fast)?

Sincerely Yours,
Kent