Which dump file is offered in smaller sub files?
On Sun, Dec 19, 2010 at 6:02 PM, Platonides <Platonides(a)gmail.com> wrote:
Diederik van Liere wrote:
To continue the discussion on how to improve
performance, would it be possible to distribute the dumps as a 7z / gz / other format
archive containing multiple smaller XML files? It's quite tricky to split a very large
XML file into smaller valid XML files, and if the dumping process is already parallelized
then we do not have to cat the different XML files into one large XML file; instead we
can distribute the multiple smaller files produced in parallel.
best,
Diederik
That has already been done for enwiki.
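
For dumps that are still shipped as a single file, the splitting discussed above can be sketched roughly as follows. This is only an illustration, not how the dump scripts work: the function name split_pages, the parameter names, and the assumption that every <page> is a direct child of a <mediawiki> root are all hypothetical, and the original root attributes and <siteinfo> header are not copied into the chunks.

```python
import xml.etree.ElementTree as ET


def split_pages(source, pages_per_chunk, root_tag="mediawiki", page_tag="page"):
    """Yield XML strings, each a well-formed document with up to pages_per_chunk pages.

    Assumes (hypothetically) that pages are direct children of the root element.
    """
    def wrap(pages):
        # Re-wrap the serialized pages in a fresh root so each chunk parses on its own.
        return "<%s>\n%s\n</%s>\n" % (root_tag, "\n".join(pages), root_tag)

    batch = []
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        # Compare the local name only, so a namespaced tag such as "{ns}page" matches too.
        if event == "end" and elem.tag.rsplit("}", 1)[-1] == page_tag:
            elem.tail = None  # drop trailing whitespace between pages
            batch.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # release already-processed children to keep memory flat
            if len(batch) == pages_per_chunk:
                yield wrap(batch)
                batch = []
    if batch:
        yield wrap(batch)
```

Each yielded string can be written to its own file and compressed separately. Because the input is read as a stream, the whole dump never has to fit in memory.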
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l