Bugs item #2886547, was opened at 2009-10-26 16:39
Message generated for change (Settings changed) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=288654…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 8
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: xqt (xqt)
Summary: pywikipedia changes encoded tags to their plain equivalent
Initial Comment:
It was reported that the pywikipedia framework changes HTML entities into their plain
character equivalents.
See the diffs listed at
http://en.wikipedia.org/w/index.php?title=User_talk:Misza13&oldid=32214…
for why this can be a bad idea.
----------------------------------------------------------------------
Comment By: euku (eugo)
Date: 2009-11-02 15:44
Message:
I wrote a new bug report, but now I see that someone has already reported it.
This should help:
wikipedia.unescape() is used in _getEditPage(), so the bot does not
read the real page content. E.g. my archive bot unintentionally changes comments.
[1] I cannot see any purpose for this function, because the bot already gets what it
wants; see my demo page: source [2] <--> XML [3]
In line 785 (wikipedia.py) unescape() does the following:
- this is a test &lt;ref&gt;, blah blub....
+ this is a test <ref>, blah blub....
To follow this bug just use on German WP:
python get.py Benutzer:Euku/Spielwiese3
but it does not happen with:
python replace.py -page:Benutzer:Euku/Spielwiese3 "this" "sdklf"
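
A minimal sketch of the effect described above, not the actual wikipedia.py code: it assumes unescape() behaves roughly like the standard library's html.unescape(), and read_page_text() is a hypothetical stand-in for the page-reading step in _getEditPage(). It shows how decoding entities on read turns an escaped tag into a live one, so that saving the text back alters the page.

```python
import html


def read_page_text(raw_wikitext, unescape=False):
    """Hypothetical stand-in for the page-reading step.

    With unescape=True the text is decoded the way the report
    describes; with unescape=False the wikitext is returned as stored.
    """
    if unescape:
        # Assumption: html.unescape() approximates wikipedia.unescape().
        return html.unescape(raw_wikitext)
    return raw_wikitext


# Stored wikitext contains an escaped tag that should be shown literally.
raw = "this is a test &lt;ref&gt;, blah blub...."

decoded = read_page_text(raw, unescape=True)
untouched = read_page_text(raw)

print(decoded)    # the escaped tag has become a real <ref> tag
print(untouched)  # identical to what is stored on the wiki
```

If the bot writes `decoded` back, the literal text becomes an opening <ref> tag, which is the kind of unintended change reported here.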
--
[1]
http://de.wikipedia.org/w/index.php?title=Wikipedia:Bots/Anfragen/Archiv/20…
[2]
http://de.wikipedia.org/w/index.php?title=Benutzer:Euku/Spielwiese3&act…
[3]
http://de.wikipedia.org/wiki/Spezial:Exportieren/Benutzer:Euku/Spielwiese3
rv7578, 2009/11/01, 10:21:50
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2009-10-29 14:33
Message:
Very strange! I do not believe that this is part of PWRF. Do you have any
release information? Is the operator's script published anywhere?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2009-10-28 21:22
Message:
Re xqt: Hmm, not according to the bot operator. :\
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2009-10-27 07:54
Message:
Some of these edits look like cosmetic changes (like cleanup headers for
instance) but it is definitely not: cc is disabled on talk pages and it
doesn't touch html entities like &lt; and &amp;.
I guess this is not a standard python bot and these "cosmetic changes" are
implemented with that archiving script.
----------------------------------------------------------------------