https://bugzilla.wikimedia.org/show_bug.cgi?id=55165
Web browser: ---
Bug ID: 55165
Summary: Wikia returns cached pages for get.py editarticle.py
Product: Pywikibot
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: General
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: legoktm.wikipedia@gmail.com
Classification: Unclassified
Mobile Platform: ---
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1537/
Reported by: throwy
Created on: 2012-11-07 12:36:40
Subject: Wikia returns cached pages for get.py editarticle.py

Original description:
get.py and editarticle.py use a method of page fetching that results in cached pages from Wikia.
replace.py uses the pagegenerator method, which fetches the latest version of pages from Wikia.
The issue is probably a Wikia issue, but it would be nice to implement a workaround in pywikipediabot.
Steps to reproduce:
1. Create or edit a page on a Wikia wiki.
2. Fetch the page with editarticle.py or get.py. The bot fetches a cached version.
3. Edit the page with replace.py. The bot fetches the most recent version, which is the expected behavior.
Comments: Someone had already solved this issue for me on #pywikipediabot on freenode. It requires very little alteration to get.py and editarticle.py. Unfortunately I did not back up or document the changes before updating pywikipediabot from SVN and the changes were lost.
----
$ python version.py
Pywikipedia [http] trunk/pywikipedia (r10663, 2012/11/04, 19:53:31)
Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
--- Comment #1 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Some remarks: I changed the hostname() in family file to "mlp.wikia.com" and used the following statements:
import wikipedia as wp
s = wp.getSite('wikia', 'wikia')
p = wp.Page(s, 'Template:Date/doc')
t = p.get(force=True)
result:
Traceback (most recent call last):
  File "<pyshell#69>", line 1, in <module>
    t = p.get(force=True)
  File "wikipedia.py", line 699, in get
    expandtemplates = expandtemplates)
  File "wikipedia.py", line 800, in _getEditPage
    "Page does not exist. In rare cases, if you are certain the page does exist, look into overriding family.RversionTab")
NoPage: (wikia:wikia, u'[[wikia:Template:Date/doc]]', 'Page does not exist. In rare cases, if you are certain the page does exist, look into overriding family.RversionTab')
the query param dict was: {'inprop': ['protection', 'subjectid'], 'rvprop': ['content', 'ids', 'flags', 'timestamp', 'user', 'comment', 'size'], 'prop': ['revisions', 'info'], 'titles': u'Template:Date/doc', 'rvlimit': 1, 'action': 'query'}
the result data dict was: {u'query': {u'pages': {u'-1': {u'protection': [], u'ns': 10, u'missing': u'', u'title': u'Template:Date/doc'}}}}
and last the url is: /api.php?inprop=protection%7Csubjectid&format=json&rvprop=content%7Cids%7Cflags%7Ctimestamp%7Cuser%7Ccomment%7Csize&prop=revisions%7Cinfo&titles=Template%3ADate/doc&rvlimit=1&action=query
which gives the right result via browser e.g.: http://mlp.wikia.com/api.php?inprop=protection%7Csubjectid&format=json&a...
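As a rough illustration of the request and response quoted above, the following sketch shows how a parameter dict of that shape could be flattened into the query string, and how the "missing" marker in the result dict can be detected. It is written for Python 3 and does not reproduce the code in wikipedia.py; the helper names build_query_url and missing_titles are hypothetical, and adding format=json inside the helper is an assumption based on the quoted URL.

```python
from urllib.parse import urlencode


def build_query_url(params, path="/api.php"):
    """Join list values with '|' and URL-encode them (hypothetical helper)."""
    flat = {
        key: "|".join(value) if isinstance(value, (list, tuple)) else value
        for key, value in params.items()
    }
    # Assumption: the framework adds format=json itself, as the quoted URL suggests.
    flat["format"] = "json"
    # safe="/" leaves the slash in "Template:Date/doc" unescaped, as in the quoted URL.
    return path + "?" + urlencode(flat, safe="/")


def missing_titles(data):
    """Return the titles that a query response flags as missing."""
    pages = data.get("query", {}).get("pages", {})
    return [page["title"] for page in pages.values() if "missing" in page]


# Parameter dict and result dict copied from the comment above.
params = {
    "inprop": ["protection", "subjectid"],
    "rvprop": ["content", "ids", "flags", "timestamp", "user", "comment", "size"],
    "prop": ["revisions", "info"],
    "titles": "Template:Date/doc",
    "rvlimit": 1,
    "action": "query",
}
result = {
    "query": {
        "pages": {
            "-1": {"protection": [], "ns": 10, "missing": "", "title": "Template:Date/doc"}
        }
    }
}

url = build_query_url(params)
print(url)
print(missing_titles(result))
```

The parameter order in the printed URL may differ from the one quoted above, which does not matter to the API. A page entry that carries a "missing" key, here under the placeholder id "-1", is what leads to the NoPage exception in the traceback.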
--- Comment #2 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- I found the patched editarticle.py on pastebin, woohoo!
<pre>33a34
> import pagegenerators
157c158
< self.page = pywikibot.Page(site, pageTitle)
---
> self.page = iter(pagegenerators.PreloadingGenerator([pywikibot.Page(site, pageTitle)])).next()</pre>
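The shape of the patched line can be sketched without pywikibot installed. Everything below is a hypothetical stand-in (the Page class, preloading_generator and fake_fetch_batch are not pywikibot APIs); it only illustrates the idiom of wrapping a single page in a list, passing it through a batch-fetching generator, and taking the first item. It is written for Python 3, where the built-in next() replaces the Python 2 .next() method used in the diff.

```python
class Page:
    """Hypothetical stand-in for a wiki page object; not the pywikibot class."""

    def __init__(self, title):
        self.title = title
        self.text = None  # None means "not fetched yet"


def preloading_generator(pages, fetch_batch):
    """Stand-in for a preloading generator.

    Fetches the text of all pages in one batch call, then yields the
    pages with their text already filled in.
    """
    pages = list(pages)
    texts = fetch_batch([page.title for page in pages])
    for page in pages:
        page.text = texts.get(page.title)
        yield page


def fake_fetch_batch(titles):
    # Hypothetical batch fetcher; a real one would query the wiki.
    return {title: "latest text of " + title for title in titles}


# Same shape as the patched line: one page in a list, first item taken out.
# A generator is already an iterator, so no extra iter() call is needed.
page = next(preloading_generator([Page("Template:Date/doc")], fake_fetch_batch))
print(page.title, "->", page.text)
```

The point of the idiom is that the page comes back from the generator with its content already loaded through the batch path, so a later read does not go through the single-page fetch that returned the cached copy.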
--- Comment #3 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- - **milestone**: --> trunk
--- Comment #4 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- diff of editarticle.py with working pagegenerators fetching
--- Comment #5 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- diff of get.py with working pagegenerators fetching
--- Comment #6 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Yes, this patch retrieves the page content via Special:Import instead of the API, because the API bulk call is not approved for the trunk release. Thus this patch wouldn't work for the rewrite branch.
Anyway, it is not clear to me why the API returns the data when called from a browser but not when queried through the bot framework.
--- Comment #7 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- changed the hostname() in family file to "mlp.wikia.com" and used the following statements:
import wikipedia as wp
s = wp.getSite('wikia', 'wikia')
p = wp.Page(s, 'Template:Date/doc')
t = p.get(force=True)
works for me.
--- Comment #8 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- adding version info:
Pywikipedia [https] r/pywikibot/compat (r10308, a208b54, 2013/09/24, 09:51:19, ok)
Python 2.7.3 (default, Apr 10 2012, 23:24:47)
[MSC v.1500 64 bit (AMD64)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com changed:
What     |Removed |Added
----------------------------------------------------------------------------
See Also |        |https://sourceforge.net/p/pywikipediabot/bugs/1537
Nemo federicoleva@tiscali.it changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |tim@wikia-inc.com
--- Comment #9 from Nemo federicoleva@tiscali.it --- Was this ever reported to Wikia? If not, please write to community AT wikia.com
John Mark Vandenberg jayvdb@gmail.com changed:
What    |Removed     |Added
----------------------------------------------------------------------------
CC      |            |jayvdb@gmail.com
Version |unspecified |compat (1.0)
John Mark Vandenberg jayvdb@gmail.com changed:
What      |Removed |Added
----------------------------------------------------------------------------
Component |General |Other scripts