https://bugzilla.wikimedia.org/show_bug.cgi?id=73184
Bug ID: 73184
Summary: redirects to other wiki erroneously lead to CircularRedirect
Product: Pywikibot
Version: core (2.0)
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: redirect.py
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: mpaa.wiki@gmail.com
Web browser: ---
Mobile Platform: ---
python scripts/redirect.py double -family:wikisource -lang:en
Retrieving special page...
The Army and Navy Hymnal/Catholic/Tantum ergo Sacramentum <<<
ERROR: Page [[en:The Army and Navy Hymnal/Catholic/Tantum ergo Sacramentum]] is a circular redirect. Skipping [[en:The Army and Navy Hymnal/Catholic/Tantum ergo Sacramentum]].
Page content is: The Army and Navy Hymnal/Catholic/Tantum ergo Sacramentum Redirect to: la:The Army and Navy Hymnal/Catholic/Tantum Ergo
The Army and Navy Hymnal/Catholic/Tantum Ergo Redirect to: la:The Army and Navy Hymnal/Catholic/Tantum Ergo
The prefix la: is not considered, so the script assumes the target is on the en: site.
Fabian CommodoreFabianus@gmx.de changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |CommodoreFabianus@gmx.de
--- Comment #1 from Fabian CommodoreFabianus@gmx.de ---
The problem is not that 'la:' is not considered but that the redirect target is not a page on that site. If you compare the request with "Main Page:English" (which is a redirect to "Main Page") you get:
http://en.wikisource.org/w/api.php?action=query&prop=pageprops&title...
While your example gets the following:
http://en.wikisource.org/w/api.php?action=query&prop=pageprops&title...
The problem is that when the result contains no 'pages' entry, the code treats it as a circular redirect. However, a legitimate double redirect doesn't contain a 'pages' entry either:
http://test.wikipedia.org/w/api.php?action=query&prop=pageprops&titl...
So I think the only way is to make the comparison more intelligent and only declare a circular redirect if pywikibot goes over each redirect and finds an already seen page. Or use the link parser to detect whether it's an interwiki link.
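The first option above (follow each redirect and only report a loop once a page is seen twice) can be sketched roughly as follows. This is not pywikibot code: find_redirect_loop and redirect_map are hypothetical names, and the dict stands in for the lookups that would really go through the API.

```python
def find_redirect_loop(start, redirect_map):
    """Follow redirects from start; return the loop if one exists, else None.

    redirect_map is a hypothetical stand-in for the wiki: it maps the
    title of a redirect page to the title of its target. A title that
    is not a key in the map is treated as an ordinary page.
    """
    seen = [start]
    current = start
    while current in redirect_map:
        current = redirect_map[current]
        if current in seen:
            # We came back to a page already visited: a genuine loop.
            return seen[seen.index(current):]
        seen.append(current)
    # The chain ended on an ordinary page, so it is not circular.
    return None
```

With this approach a missing 'pages' entry is no longer needed as evidence: a double redirect such as A -> B -> C simply ends at C, while A -> B -> A is reported as a loop.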
--- Comment #2 from John Mark Vandenberg jayvdb@gmail.com ---
[[s:en:The Army and Navy Hymnal/Catholic/Tantum ergo Sacramentum]] is an interlanguage redirect.
And this is why interwikis are, by policy but not software, not allowed in redirects, and should be replaced with {{soft redirect}}. The MediaWiki bug for this is bug 39492.
I've lowered the priority because the script skips this page, and should skip this page; only the output is wrong.
This problem dates back to the original 'core' function getredirtarget, lines 547-8:
http://git.wikimedia.org/blobdiff/pywikibot%2Fcore.git/852973b62c9d597db5fb0...
if "pages" not in result['query']:
    # no "pages" element indicates a circular redirect
    raise pywikibot.CircularRedirect....
The result doesn't have 'pages' because en.ws can't return content that is on another wiki, la.ws:
{"query":{"redirects":[{"from":"The Army and Navy Hymnal/Catholic/Tantum ergo Sacramentum","to":"la:The Army and Navy Hymnal/Catholic/Tantum Ergo"}],"userinfo":{"id":10823,"name":"JVbot"}}}
There is a chance that other text in a #redirect might also cause the same error, but I doubt it.
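As a rough illustration of how the response quoted above could be told apart from a circular redirect, the sketch below checks each redirect target for an interwiki prefix. The function name interwiki_targets and the hard-coded prefix set are assumptions made for this example; pywikibot would get the prefixes from the site and would more likely use its own link parser.

```python
import json

# The API response quoted in this comment, unchanged.
RESPONSE = (
    '{"query":{"redirects":[{"from":"The Army and Navy Hymnal/Catholic/'
    'Tantum ergo Sacramentum","to":"la:The Army and Navy Hymnal/Catholic/'
    'Tantum Ergo"}],"userinfo":{"id":10823,"name":"JVbot"}}}'
)


def interwiki_targets(result, prefixes):
    """Return the redirect targets in result that start with a known prefix.

    prefixes is a hypothetical set of lowercase interwiki prefixes.
    """
    found = []
    for redirect in result["query"].get("redirects", []):
        prefix, sep, _ = redirect["to"].partition(":")
        if sep and prefix.lower() in prefixes:
            found.append(redirect["to"])
    return found


result = json.loads(RESPONSE)
has_pages = "pages" in result["query"]
targets = interwiki_targets(result, {"la"})
```

Here has_pages is False even though nothing is circular, and targets holds the single la: title, which is the information the current check throws away.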
John Mark Vandenberg jayvdb@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
Priority|Unprioritized               |Low
CC      |                            |jayvdb@gmail.com
See Also|                            |https://bugzilla.wikimedia.org/show_bug.cgi?id=39492
Severity|normal                      |minor
--- Comment #3 from Mpaa mpaa.wiki@gmail.com ---
Jayvdb, it is not clear to me whether this is a bug or a mistake by whoever created the redirect to another site.
In the first case, I have a fix that gets the page on the other site as the redirect target. In the second case, we should close the bug.
--- Comment #4 from John Mark Vandenberg jayvdb@gmail.com ---
@Mpaa, well, MediaWiki has limited support for interwiki links in redirects. There may even be a LocalSettings.php config variable that stops interwiki links from redirecting automatically.
Pywikibot needs to detect them, and either access them normally or raise an appropriate exception, for example an InterwikiRedirectPage subclass of PageRelatedError.
I'd prefer that we consider it an invalid page and raise an exception, because I expect there is a lot of code which believes that a redirect target will always be on the same site as the redirect. The API can't 'follow redirects' if the redirect is an interwiki link.
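A minimal sketch of the exception approach preferred here, for illustration only. PageRelatedError below is a simplified stand-in written for this example, not pywikibot's actual class, and local_redirect_target with its interwiki_prefixes argument is a hypothetical helper; the real patch may look quite different.

```python
class PageRelatedError(Exception):
    """Simplified stand-in for pywikibot's PageRelatedError."""

    message = "Error on page %s"

    def __init__(self, page):
        self.page = page
        super().__init__(self.message % page)


class InterwikiRedirectPage(PageRelatedError):
    """Raised when a redirect points to a page on another site."""

    message = "Page %s is a redirect to another site"

    def __init__(self, page, target):
        self.target = target
        super().__init__(page)


def local_redirect_target(title, target, interwiki_prefixes):
    """Return target if it is local, otherwise raise InterwikiRedirectPage.

    interwiki_prefixes is a hypothetical set of lowercase prefixes.
    """
    prefix, sep, _ = target.partition(":")
    if sep and prefix.lower() in interwiki_prefixes:
        raise InterwikiRedirectPage(title, target)
    return target
```

Callers that assume a same-site target keep working unchanged, and a script such as redirect.py could catch InterwikiRedirectPage and print an accurate reason for skipping the page.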
--- Comment #5 from Mpaa mpaa.wiki@gmail.com ---
Patch: Change-Id: Ia0d4dadf713fb97572c5d482485858331bda5ea8
Mpaa mpaa.wiki@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
Status  |NEW                         |PATCH_TO_REVIEW
--- Comment #6 from Gerrit Notification Bot gerritadmin@wikimedia.org ---
Change 172176 had a related patch set uploaded by Mpaa: Introduce InterwikiRedirectPage
https://gerrit.wikimedia.org/r/172176
pywikipedia-bugs@lists.wikimedia.org