https://bugzilla.wikimedia.org/show_bug.cgi?id=54548
Web browser: ---
Bug ID: 54548
Summary: _getUserDataOld call from low-level getUrl
Product: Pywikibot
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: General
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: legoktm.wikipedia@gmail.com
Classification: Unclassified
Mobile Platform: ---
Originally from: http://sourceforge.net/p/pywikipediabot/patches/608/
Reported by: valhallasw
Created on: 2013-04-13 20:51:59
Subject: _getUserDataOld call from low-level getUrl
Original description:
From http://lists.wikimedia.org/pipermail/pywikipedia-l/2012-October/007585.html :
I just wanted to put() a simple page on a MediaWiki 1.16 instance, where I have to use screen scraping (use_api=False).
There is something strange, however: even with the API disabled, an API call is still made, invoked by _getBlock:
/w/api.php?action=query&format=json&meta=userinfo&uiprop=blockinfo
Here's my backtrace:
  File "pywikipedia/wikipedia.py", line 693, in get
    expandtemplates = expandtemplates)
  File "pywikipedia/wikipedia.py", line 743, in _getEditPage
    return self._getEditPageOld(get_redirect, throttle, sysop, oldid, change_edit_time)
  File "pywikipedia/wikipedia.py", line 854, in _getEditPageOld
    text = self.site().getUrl(path, sysop = sysop)
  File "pywikipedia/wikipedia.py", line 5881, in getUrl
    self._getUserDataOld(text, sysop = sysop)
  File "pywikipedia/wikipedia.py", line 6016, in _getUserDataOld
    blocked = self._getBlock(sysop = sysop)
  File "pywikipedia/wikipedia.py", line 5424, in _getBlock
    data = query.GetData(params, self)
  File "pywikipedia/query.py", line 146, in GetData
    jsontext = site.getUrl( path, retry=True, sysop=sysop, data=data)
getUrl(), which is also called from the API code, seems to always call _getUserDataOld(text), even when text is API output. It then tries to screen-scrape that output and gives warnings like
Note: this language does not allow global bots.
WARNING: Token not found on wikipedia:pl. You will not be able to edit any page.
which is nonsense since the analyzed text is not HTML - only API output.
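A minimal sketch of why those warnings are spurious. The regular expression below is only a hypothetical stand-in for whatever pattern the screen-scraping code uses to find the edit token (it is not copied from wikipedia.py), and find_edit_token, html_page and api_output are names and data made up for this example. Given an HTML edit form the scraper finds a token; given JSON from api.php there is nothing for it to match, which is the situation that would produce "Token not found".

```python
import json
import re

# Hypothetical pattern, standing in for the scraper's real one.
TOKEN_RE = re.compile(r'value="([^"]*)"\s+name="wpEditToken"')


def find_edit_token(text):
    """Return the edit token scraped from text, or None if absent."""
    match = TOKEN_RE.search(text)
    return match.group(1) if match else None


# A made-up fragment of an HTML edit form.
html_page = '<input type="hidden" value="abc123" name="wpEditToken" />'
# A made-up userinfo reply, shaped like api.php JSON output.
api_output = json.dumps({"query": {"userinfo": {"id": 1, "name": "Saper"}}})

print(find_edit_token(html_page))   # token found in the HTML form
print(find_edit_token(api_output))  # nothing to scrape in JSON
```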
If getUrl() is supposed to be a low-level call, why call _getUserDataOld() there?
http://www.mediawiki.org/wiki/Special:Code/pywikipedia/7461
has introduced this call there.
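One possible shape for a fix, as a rough sketch only: keep getUrl() low-level and skip the user-data scraping whenever the request goes to api.php. Everything here is hypothetical. SketchSite, is_api_path, the _fetch stub and the canned responses are invented for illustration and are not the actual wikipedia.py code; only the method names getUrl and _getUserDataOld and the API path are taken from the report above.

```python
def is_api_path(path):
    """Guess whether a request path targets api.php (hypothetical helper)."""
    return path.split("?", 1)[0].endswith("/api.php")


class SketchSite(object):
    """Toy stand-in for the Site class; not the real implementation."""

    def __init__(self):
        self.scraped = []  # records every text handed to the scraper

    def _fetch(self, path):
        # Stub replacing the real HTTP request: canned JSON or HTML.
        if is_api_path(path):
            return '{"query": {"userinfo": {}}}'
        return "<html>edit form</html>"

    def _getUserDataOld(self, text):
        # The real method scrapes HTML; here we only record the call.
        self.scraped.append(text)

    def getUrl(self, path):
        text = self._fetch(path)
        # Only screen-scrape user data out of HTML pages, never API output.
        if not is_api_path(path):
            self._getUserDataOld(text)
        return text


site = SketchSite()
site.getUrl("/w/index.php?title=User:Saper&action=edit")
site.getUrl("/w/api.php?action=query&format=json&meta=userinfo&uiprop=blockinfo")
print(site.scraped)  # only the HTML page was passed to the scraper
```

An alternative with the same effect would be to move the _getUserDataOld() call out of getUrl() entirely and into the callers that are known to fetch HTML pages.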
It's easily reproducible with this:
import wikipedia
import config

config.use_api = False
wikipedia.verbose = True
s = wikipedia.getSite("pl", "wikipedia")
p = wikipedia.Page(s, u"User:Saper")
c = p.get()
c += "<!-- test -->"
p.put(c, u"Testing wiki", botflag=False)
//Saper