https://bugzilla.wikimedia.org/show_bug.cgi?id=55307
Web browser: ---
Bug ID: 55307
Summary: Section headers with templates are not correctly recognised
Product: Pywikibot
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: interwiki.py
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: legoktm.wikipedia@gmail.com
Classification: Unclassified
Mobile Platform: ---
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/914/
Reported by: Anonymous user
Created on: 2009-04-20 16:13:14
Subject: Section headers with templates are not correctly recognised
Original description: The bot sometimes removes a valid interwiki link that leads to an anchor (section) in another article. See http://cs.wikipedia.org/w/index.php?title=Platnost%5C_%5C(pr%C3%A1vo%5C)&...
--- Comment #1 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
valhallasw@dorthonion:~/src/pywikipedia/trunk$ python interwiki.py cs:Platnost_%28právo%29
Getting 1 page from wikipedia:cs...
[[cs:Platnost (právo)]]: [[cs:Platnost (právo)]] gives new interwiki [[de:Gültigkeit#Gültigkeit im Recht]]
Getting 1 page from wikipedia:de...
NOTE: [[de:Gültigkeit#Gültigkeit im Recht]] does not exist. Skipping.
======Post-processing [[cs:Platnost (právo)]]======
Updating links on page [[cs:Platnost (právo)]].
Changes to be made: Robot: Removing [[de:Gültigkeit#Gültigkeit im Recht]]
- [[de:Gültigkeit#Gültigkeit im Recht]]
ERROR: Found incorrect link to de in [[cs:Platnost (právo)]]
Submit? ([y]es, [n]o, open in [b]rowser, [g]ive up, [a]lways)
The link tries to refer to the section header
=== {{Anker|Rechtsgültig}}Gültigkeit im Recht ===
Using #G.C3.BCltigkeit_im_Recht instead does not help.
Removing {{Anker|...}} from the header does work, provided the #Gültigkeit im Recht form of the link is used.
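The two anchor spellings tried above name the same section: #G.C3.BCltigkeit_im_Recht is the older dot-encoded form of #Gültigkeit im Recht. A minimal sketch of how that form is derived, for the common case only (the helper name legacy_anchor is hypothetical, and MediaWiki's real escaping has further special cases that are not reproduced here):

```python
from urllib.parse import quote


def legacy_anchor(section_title):
    """Approximate the dot-encoded anchor for a section title.

    Hypothetical helper: spaces become underscores, non-ASCII characters are
    percent-encoded as UTF-8, and each '%' is then replaced by '.'.
    """
    underscored = section_title.strip().replace(" ", "_")
    return quote(underscored, safe="").replace("%", ".")


print(legacy_anchor("Gültigkeit im Recht"))
```

Since both spellings resolve to the same header, switching between them cannot help while the header check itself fails on the {{Anker|...}} template.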
--- Comment #2 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
- **labels**: --> interwiki
- **milestone**: --> confirmed
- **priority**: 5 --> 6
--- Comment #3 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
- **summary**: Removing of interwiki to anchor --> Section headers with templates are not correctly recognised
--- Comment #4 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
From the wikitext alone, it is hard to determine whether the section title is present (it takes an evil regexp). We could combine a regexp with a fallback to the API using action=parse, i.e. http://de.wikipedia.org/w/api.php?action=parse&text=%5C%7B%5C%7B:G%C3%BC...
The rewrite does not raise SectionErrors at all.
Options:
- strip the check
- try to get the regexp working
- keep a simple regexp with an API fallback
Question: how should this be implemented in the rewrite?
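The third option could look roughly like the sketch below. It is not the interwiki.py or rewrite code: HEADER_RE, section_exists and the fetch_rendered_sections callback are hypothetical names, and the callback stands in for whatever API call would return the rendered section titles.

```python
import re

# Simple header pattern: same number of '=' on both sides of the title.
HEADER_RE = re.compile(r"^(=+)[ \t]*(.*?)[ \t]*\1[ \t]*$", re.MULTILINE)


def section_exists(wikitext, section, fetch_rendered_sections):
    """Return True if `section` is a header of the page.

    Fast path: a literal match against the raw wikitext headers.
    Fallback: ask the (hypothetical) callback for the rendered section
    titles, which is only needed when a header contains templates.
    """
    for match in HEADER_RE.finditer(wikitext):
        if match.group(2) == section:
            return True
    # No literal match; the header may be hidden behind a template.
    return section in fetch_rendered_sections()


page = (
    "Intro\n"
    "=== {{Anker|Rechtsgültig}}Gültigkeit im Recht ===\n"
    "Text\n"
    "== Siehe auch ==\n"
)
```

With this shape the extra query is only paid for links whose section is not found literally, which is the case that currently produces the false "does not exist" note.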
--- Comment #5 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
Long story short: it is impossible to do this without another API query. We can use
http://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=He...
to do this. We cannot do it using regexps because of template expansions.
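A minimal sketch of such a query, assuming the JSON output format and assuming that each entry of the returned section list carries an "anchor" field using underscores for spaces. The function names are hypothetical, no request is sent here, and the sample response is hand-written for illustration rather than copied from a wiki. On wikis that emit dot-encoded anchors the comparison would have to be adapted.

```python
import json
from urllib.parse import urlencode


def sections_query_url(api_url, page_title):
    """Build an action=parse&prop=sections URL for one page."""
    params = {
        "action": "parse",
        "prop": "sections",
        "page": page_title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def has_section(response_text, section):
    """Check a prop=sections JSON response for the given section name."""
    data = json.loads(response_text)
    wanted = section.strip().replace(" ", "_")
    for entry in data.get("parse", {}).get("sections", []):
        if entry.get("anchor") == wanted:
            return True
    return False


# Hand-written sample response, reduced to the fields used above.
sample = json.dumps({
    "parse": {
        "title": "Gültigkeit",
        "sections": [
            {"line": "Gültigkeit im Recht", "anchor": "Gültigkeit_im_Recht"},
        ],
    }
})

print(sections_query_url("http://de.wikipedia.org/w/api.php", "Gültigkeit"))
print(has_section(sample, "Gültigkeit im Recht"))
```

Because the server expands templates before building the section list, a header such as the {{Anker|...}} one is reported under its rendered name, which a regexp over the raw wikitext cannot reproduce.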
Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com changed:
What     | Removed | Added
---------+---------+---------------------------------------------------
See Also |         | https://sourceforge.net/p/pywikipediabot/bugs/914