Bugs item #2886547, was opened at 2009-10-26 16:39
Message generated for change (Settings changed) made by xqt
You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2886547...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 8
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: xqt (xqt)
Summary: pywikipedia changes encoded tags to their plain equivalent
Initial Comment: It was reported that the pywikipedia framework changes HTML entities into their plain character equivalents.
See the diffs listed at http://en.wikipedia.org/w/index.php?title=User_talk:Misza13&oldid=322148... for why this can be a bad idea.
----------------------------------------------------------------------
Comment By: euku (eugo) Date: 2009-11-02 15:44
Message: I filed a new bug report before noticing that this had already been reported. This should help:
wikipedia.unescape() is used in _getEditPage(), so the bot does not read the real page content. For example, my archive bot unintentionally changes comments. [1] I cannot see any purpose for this function, because the bot gets what it wants; see my demo page: source [2] <--> XML [3]
In line 785 (wikipedia.py), unescape() does the following:
- this is a test &lt;ref&gt;, blah blub....
+ this is a test <ref>, blah blub....
To reproduce this bug on the German WP, run:
python get.py Benutzer:Euku/Spielwiese3
It does not happen with:
python replace.py -page:Benutzer:Euku/Spielwiese3 "this" "sdklf"
--
[1] http://de.wikipedia.org/w/index.php?title=Wikipedia:Bots/Anfragen/Archiv/200...
[2] http://de.wikipedia.org/w/index.php?title=Benutzer:Euku/Spielwiese3&acti...
[3] http://de.wikipedia.org/wiki/Spezial:Exportieren/Benutzer:Euku/Spielwiese3
rv7578, 2009/11/01, 10:21:50
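The effect described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the standard library's html.unescape() as a stand-in for wikipedia.unescape(), whose actual implementation is not shown in this thread, and the sample text is taken from the demo line above.

```python
import html

# Wikitext as the editor wrote it. The entities are intentional: the wiki
# then displays the literal text "<ref>" instead of opening a footnote.
page_text = "this is a test &lt;ref&gt;, blah blub...."

# Stand-in for the unescape step applied when the page is read.
# Assumption: html.unescape() only approximates wikipedia.unescape().
read_by_bot = html.unescape(page_text)

print("on the wiki:", page_text)
print("seen by bot:", read_by_bot)

# If the bot saves what it read, the entities are gone from the page.
print("text changed:", read_by_bot != page_text)
```

If the bot later writes read_by_bot back, the page ends up with a real <ref> tag where the editor meant literal text, which matches the kind of change shown in the reported diffs.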
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2009-10-29 14:33
Message: Very strange! I do not believe that this is part of PWRF. Do you have any release information? Is the operator's script published anywhere?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody) Date: 2009-10-28 21:22
Message: Re xqt: Hmm, not according to the bot operator. :\
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2009-10-27 07:54
Message: Some of these edits look like cosmetic changes (cleaning up headers, for instance), but they are definitely not: cc is disabled on talk pages, and it doesn't touch HTML entities like &lt; and &amp;.
I guess this is not a standard Python bot and these "cosmetic changes" are implemented in that archiving script.
----------------------------------------------------------------------