Again, the link is the same, but the dump changed to 14 June.
2011/6/8 Bináris <wikiposta@gmail.com>
Sorry. This one is suspicious:
http://dumps.wikimedia.org/huwiki/latest/huwiki-latest-pages-articles.xml.bz2
My screen is:
Traceback (most recent call last):
  File "C:\Program Files\Pywikipedia\pagegenerators.py", line 1255, in __iter__
    for page in self.wrapped_gen:
  File "C:\Program Files\Pywikipedia\pagegenerators.py", line 1113, in NamespaceFilterPageGenerator
    for page in generator:
  File "C:\Program Files\Pywikipedia\pagegenerators.py", line 1157, in DuplicateFilterPageGenerator
    for page in generator:
  File "replace.py", line 259, in __iter__
    for entry in self.parser:
  File "C:\Program Files\Pywikipedia\xmlreader.py", line 313, in new_parse
    for event, elem in context:
  File "<string>", line 68, in __iter__
SyntaxError: no element found: line 27524494, column 9
no element found: line 27524494, column 9
59 titles were saved.
I ran replace.py with 3 different fixes, and the result and the numbers are always the same. It stops somewhere around May 2011, but maybe that is simply the end of the dump. Is it possible that every occurrence is found and I get the error message only at the end of the dump? I can only check the creation date of the last article found, which is April/May of this year, so I am not sure whether anything is left out.
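
One way to check whether the error really comes only at the end of the dump is to walk the file independently of replace.py and see how far the parser gets. The following is only a rough sketch, not part of Pywikipedia: the names scan_pages and scan_dump are made up for illustration, and it assumes Python 3 and the standard library (the Pywikipedia scripts themselves run on Python 2). The expat message "no element found" usually means that the XML ended before the root element was closed, which would be consistent with a truncated or still incomplete download.

```python
import bz2
import xml.etree.ElementTree as ET


def scan_pages(stream):
    """Count complete <page> elements in an XML stream.

    Returns (pages, last_title_seen, error), where error is None if the
    whole stream parsed cleanly, or the error message otherwise.
    """
    pages = 0
    last_title_seen = None
    error = None
    try:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace, if any
            if tag == "title":
                last_title_seen = elem.text
            elif tag == "page":
                pages += 1
                elem.clear()  # release the page content to save memory
    except (ET.ParseError, EOFError) as exc:
        # ParseError: the XML itself is broken or ends too early.
        # EOFError: the bz2 stream ended before its end-of-stream marker.
        error = str(exc)
    return pages, last_title_seen, error


def scan_dump(path):
    """Hypothetical helper: run scan_pages over a local .xml.bz2 dump."""
    with bz2.open(path, "rb") as stream:
        return scan_pages(stream)
```

Calling scan_dump on the downloaded huwiki-latest-pages-articles.xml.bz2 file and comparing the last title seen with the last article that replace.py found should show whether the parser stops at the same place. If an error is reported, the local file is probably incomplete, and downloading it again (or comparing its size or checksum with what the dump server lists) would be the next step.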