Which dump file is offered in smaller sub files?
On Sun, Dec 19, 2010 at 6:02 PM, Platonides Platonides@gmail.com wrote:
Diederik van Liere wrote:
To continue the discussion on how to improve performance: would it be possible to distribute the dumps as a 7z / gz / other-format archive containing multiple smaller XML files? It's quite tricky to split a very large XML file into smaller valid XML files, and if the dumping process is already parallelized, then we do not have to cat the different XML files into one large XML file; instead we can distribute the multiple smaller files that the parallel jobs already produce.
best,
Diederik
That has already been done for enwiki.
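
For dumps that are only offered as a single large file, the splitting Diederik describes can be sketched roughly as below. This is only a minimal illustration, not how the dump scripts work: it assumes the file has one root element (called mediawiki here) with page elements as direct children, the function names split_pages and wrap are made up for this example, and each output chunk gets a bare root element, so the original root attributes and the siteinfo header are dropped.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    # Strip a "{namespace}" prefix if present, e.g. "{uri}page" -> "page".
    return tag.rsplit("}", 1)[-1]


def wrap(pages, root_tag):
    # Assumption: a bare root element is enough for downstream tools.
    return "<%s>\n%s\n</%s>\n" % (root_tag, "\n".join(pages), root_tag)


def split_pages(source, pages_per_chunk, root_tag="mediawiki", page_tag="page"):
    """Yield well-formed XML documents (as str), each holding at most
    pages_per_chunk <page> elements read from source (path or file object)."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    chunk = []
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == page_tag:
            elem.tail = None  # drop whitespace that follows the element
            chunk.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # release pages already handled to keep memory flat
            if len(chunk) == pages_per_chunk:
                yield wrap(chunk, root_tag)
                chunk = []
    if chunk:
        yield wrap(chunk, root_tag)


if __name__ == "__main__":
    sample = (
        "<mediawiki>"
        "<page><title>A</title></page>"
        "<page><title>B</title></page>"
        "<page><title>C</title></page>"
        "</mediawiki>"
    )
    for n, doc in enumerate(split_pages(io.BytesIO(sample.encode("utf-8")), 2)):
        print("chunk", n, "has", len(ET.fromstring(doc)), "page(s)")
```

Because the input is streamed with iterparse, the whole dump never has to fit in memory, but it still has to be read sequentially from start to end, which is why getting the pieces directly from the parallel dump jobs is the more attractive option.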
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l