The latest enwiki pages dump, enwiki-latest-pages-articles.xml.bz2 in
http://dumps.wikimedia.org/enwiki/latest/, is only 5.8 GB.
Previous versions, e.g.
http://dumps.wikimedia.org/enwiki/20110526/ and
http://dumps.wikimedia.org/enwiki/20110405/
have been consistently around 6.7-6.8 GB.
I checked the sizes after noticing that many pages are missing from the newest dump,
e.g.
http://en.wikipedia.org/wiki/Liar_Liar and
http://en.wikipedia.org/wiki/Juan_que_re%C3%ADa.
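For anyone who wants to reproduce the check, here is a minimal Python sketch of one way to test whether given titles are present in a dump. It assumes a locally downloaded copy of the file. The function name find_titles and the local path in the usage comment are made up for illustration, and this is not necessarily how the missing pages were originally found. Titles are written with spaces, as they appear in the dump's <title> elements, rather than with the underscores used in URLs.

```python
import bz2
import xml.etree.ElementTree as ET


def find_titles(dump_path, wanted):
    """Return the subset of `wanted` page titles that occur in the dump.

    `find_titles` is a hypothetical helper, not part of any Wikimedia tool.
    The dump is streamed, so it is never fully decompressed into memory.
    """
    remaining = set(wanted)
    found = set()
    with bz2.open(dump_path, "rb") as stream:
        context = ET.iterparse(stream, events=("start", "end"))
        _, root = next(context)  # first event is the start of the root element
        for event, elem in context:
            if event != "end":
                continue
            # Tags carry the export namespace, e.g. "{...}title"; strip it.
            tag = elem.tag.rsplit("}", 1)[-1]
            if tag == "title" and elem.text in remaining:
                found.add(elem.text)
                remaining.discard(elem.text)
                if not remaining:
                    break  # everything was found, stop early
            elif tag == "page":
                root.clear()  # drop finished pages to keep memory bounded
    return found


# Example call (assumed local path):
# find_titles("enwiki-latest-pages-articles.xml.bz2",
#             {"Liar Liar", "Juan que reía"})
```

Any title passed in but absent from the returned set was not found in that dump. A full pass over a multi-gigabyte file is slow, so running this against both an older and the newest dump is best done once with all suspect titles in a single set.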
Is this a known problem? Can anything be done to prevent this in the
future?
Thanks,
Eric