[Pywikipedia-l] Where is the error?
Marcin Cieslak
saper at saper.info
Fri Jun 24 13:34:32 UTC 2011
>> Bináris <wikiposta at gmail.com> wrote:
>> http://dumps.wikimedia.org/huwiki/latest/huwiki-latest-pages-articles.xml.bz2
>
> FilterPageGenerator
> for page in generator:
> File "replace.py", line 259, in __iter__
> for entry in self.parser:
> File "C:\Program Files\Pywikipedia\xmlreader.py", line 313, in new_parse
> for event, elem in context:
> File "<string>", line 68, in __iter__
> SyntaxError: no element found: line 27524494, column 9
> no element found: line 27524494, column 9
Yes, those dumps are broken indeed.
This is because you are using an experimental extension (LiquidThreads) on your
wiki, and this extension is unable to properly dump some fancy characters. This
page is causing trouble:
https://secure.wikimedia.org/wikipedia/hu/wiki/Téma:Szerkesztővita:Dencey/Fölösleges_információk/válasz_(3)
or, to be more specific, this:
https://secure.wikimedia.org/wikipedia/hu/wiki/Speciális:Lapok_exportálása/Téma:Szerkesztővita:Dencey/Fölösleges_információk/válasz_(3)
I have filed this bug for you:
https://bugzilla.wikimedia.org/show_bug.cgi?id=29564
but it seems this is not the first time LiquidThreads has broken the dumps
(https://bugzilla.wikimedia.org/show_bug.cgi?id=22688).
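
If you want to confirm where a downloaded dump stops being well-formed without
going through replace.py, something like the following might help. It is only a
rough sketch, assuming Python 3 and the standard library; the names
find_xml_error and check_dump are made up for illustration and are not part of
pywikipedia.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def find_xml_error(stream):
    """Parse an XML byte stream incrementally.

    Returns None if the stream is well-formed, otherwise a tuple
    (line, column, message) taken from the parser error.
    """
    try:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Drop element contents as we go; full dumps are very large.
            elem.clear()
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


def check_dump(path):
    """Run find_xml_error on a bz2-compressed dump file (hypothetical helper)."""
    with bz2.open(path, "rb") as stream:
        return find_xml_error(stream)


if __name__ == "__main__":
    # A deliberately truncated document, standing in for a broken dump.
    truncated = io.BytesIO(b"<mediawiki><page><title>A</title>")
    print(find_xml_error(truncated))
```

Calling check_dump on the downloaded .bz2 file should report the same line and
column as the traceback above if the file itself is the problem.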
//Marcin