Yes, it's a known problem; you should be able to download the pieces
instead, and yes, code is being tested to detect truncated files and flag
them. In the meantime I have to do some other testing to see whether
we're running into some constraint from running this many jobs at once,
which causes the bzips to die off or be killed off.
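For what it's worth, detecting a truncated dump amounts to walking the
compressed data and checking whether the bzip2 end-of-stream marker ever
turns up. A minimal sketch in Python follows; this is not the actual code
being tested, the file name and chunk size are just placeholders, and it
only handles a single-stream .bz2 file:

    import bz2

    def is_truncated_bz2(path, chunk_size=1 << 20):
        """Return True if the bzip2 stream in `path` ends prematurely or is corrupt."""
        decomp = bz2.BZ2Decompressor()
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    # End of file reached: if the decompressor never saw the
                    # bzip2 end-of-stream marker, the file was cut off mid-block.
                    return not decomp.eof
                try:
                    decomp.decompress(chunk)
                except OSError:
                    # Corrupt data partway through is just as unusable as truncation.
                    return True
                if decomp.eof:
                    return False

    if __name__ == "__main__":
        # Hypothetical file name, for illustration only.
        print(is_truncated_bz2("pages-articles.xml.bz2"))

Decompressing the whole file is slower than just peeking at the tail, but
it is the only reliable way to catch a stream that was cut off mid-block.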
On Tue, 05-07-2011 at 14:09 -0700, Eric Sun wrote:
The latest enwiki pages dump of
is only 5.8 GB.
Previous versions, e.g.
have been consistently around 6.7-6.8 GB.
I saw this after noticing that many pages are missing from the newest
dump, e.g. http://en.wikipedia.org/wiki/Liar_Liar
Is this a known problem? Can anything be done to prevent this in the
future?