Again, the link is the same, but the dump changed to 14 June.

2011/6/8 Bináris <wikiposta@gmail.com>
Sorry. This one is suspicious:
http://dumps.wikimedia.org/huwiki/latest/huwiki-latest-pages-articles.xml.bz2

This is what I get on screen:
Traceback (most recent call last):
  File "C:\Program Files\Pywikipedia\pagegenerators.py", line 1255, in __iter__
    for page in self.wrapped_gen:
  File "C:\Program Files\Pywikipedia\pagegenerators.py", line 1113, in NamespaceFilterPageGenerator
    for page in generator:
  File "C:\Program Files\Pywikipedia\pagegenerators.py", line 1157, in DuplicateFilterPageGenerator
    for page in generator:
  File "replace.py", line 259, in __iter__
    for entry in self.parser:
  File "C:\Program Files\Pywikipedia\xmlreader.py", line 313, in new_parse
    for event, elem in context:
  File "<string>", line 68, in __iter__
SyntaxError: no element found: line 27524494, column 9
no element found: line 27524494, column 9
59 titles were saved.

I ran replace.py with three different fixes, and the result and the numbers are the same every time. It stops somewhere around May 2011, but maybe that is simply the end of the dump. Is it possible that every occurrence has been found and the error message only appears at the end of the dump? All I can check is the creation date of the last article found, which is April or May of this year, so I cannot tell whether anything was left out.
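
One way to tell the two cases apart might be to look at the end of the dump itself. The parser's "no element found" message usually means the input ended before the document was closed, so if the file decompresses cleanly and its last tag is </mediawiki>, the error would have to come from somewhere else. The following is only a rough sketch and not part of Pywikipedia: it assumes Python 3.3 or later (for bz2.open) and a local copy of the dump, and the names dump_tail and dump_is_complete are made up for illustration.

```python
import bz2


def dump_tail(path, tail_bytes=200, chunk_size=1 << 20):
    """Decompress the whole dump and return its last `tail_bytes` bytes."""
    tail = b""
    with bz2.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Keep only the end of what has been read so far.
            tail = (tail + chunk)[-tail_bytes:]
    return tail


def dump_is_complete(path):
    """Return True if the file decompresses fully and ends with </mediawiki>."""
    try:
        tail = dump_tail(path)
    except (EOFError, OSError):
        # EOFError: the compressed stream is cut short.
        # OSError: the compressed data is invalid.
        return False
    return tail.rstrip().endswith(b"</mediawiki>")


if __name__ == "__main__":
    # Hypothetical local file name; adjust to wherever the dump was saved.
    print(dump_is_complete("huwiki-latest-pages-articles.xml.bz2"))
```

If this prints False, the download or the dump is incomplete and some pages may be missing from the results; if it prints True, the file at least ends properly and the problem is more likely elsewhere.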



--
Bináris