https://bugzilla.wikimedia.org/show_bug.cgi?id=69384
Bug ID: 69384
Summary: extract_templates_and_params parser bugs loading w:en:Main_Page
Product: Pywikibot
Version: core (2.0)
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: textlib.py
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: jayvdb@gmail.com
Web browser: ---
Mobile Platform: ---
Calling extract_templates_and_params with 'use_mwparserfromhell' enabled on the English Wikipedia Main Page returns many 'results' that are not templates, such as parser functions (#if, #time) and names that still contain nested transclusions.
For example:
$ PYTHONPATH="." python -c "import pywikibot; pywikibot.config.use_mwparserfromhell=True; print pywikibot.extract_templates_and_params(pywikibot.Page(pywikibot.Site('en', 'wikipedia'), 'Main Page').text)"
produces:
[(u'NUMBEROFARTICLES', {}), (u'#if:{{Main Page banner}}', {u'1': u'\n<table id="mp-banner" style="width: 100%; margin:4px 0 0 0; background:none; border-spacing: 0px;">\n<tr><td class="MainPageBG" style="padding:2px 8px; background-color:#fffaf5; border:1px solid #f2e0ce; color:#000; font-size:100%;">{{Main Page banner}}\n</td></tr>\n</table>\n'}), (u'Main Page banner', {}), (u'Main Page banner', {}), (u"#ifexpr:{{formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}|R}}>150", {u'1': u"From today's featured article", u'2': u'Featured article <span style="font-size:85%; font-weight:normal;">(Check back later for today's.)</span>'}), (u"formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}", {u'1': u'R'}), (u"PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u"#ifexpr:{{formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}|R}}>150", {u'1': u"{{Wikipedia:Today's featured article/{{#time:F j, Y}}}}", u'2': u"{{Wikipedia:Today's featured article/{{#time:F j, Y|-1 day}}}}"}), (u"formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}", {u'1': u'R'}), (u"PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u"Wikipedia:Today's featured article/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u"Wikipedia:Today's featured article/{{#time:F j, Y|-1 day}}", {}), (u'#time:F j, Y', {u'1': u'-1 day'}), (u'Did you know', {}), (u'In the news', {}), (u'Wikipedia:Selected anniversaries/{{#time:F j}}', {}), (u'#time:F j', {}), (u'#switch:{{CURRENTDAYNAME}}', {u'1': u'Monday', u'2': u'', u'Friday': u'\n<table id="mp-middle" style="width:100%; margin:4px 0 0 0; background:none; border-spacing: 0px;">\n<tr>\n<td class="MainPageBG" style="width:100%; border:1px solid #f2cedd; background:#fff5fa; vertical-align:top; color:#000;">\n<table id="mp-center" style="width:100%; vertical-align:top; background:#fff5fa; 
color:#000;">\n<tr>\n<td style="padding:2px;"><h2 id="mp-tfl-h2" style="margin:3px; background:#f2cedd; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #bfa3af; text-align:left; color:#000; padding:0.2em 0.4em">From today's featured list</h2></td>\n</tr><tr>\n<td style="color:#000;"><div id="mp-tfl" style="padding:2px 5px;">{{#ifexist:Wikipedia:Today's featured list/{{#time:F j, Y}}|{{Wikipedia:Today's featured list/{{#time:F j, Y}}}}|{{TFLempty}}}}</div></td>\n</tr>\n</table>\n</td>\n</tr>\n</table>'}), (u'CURRENTDAYNAME', {}), (u"#ifexist:Wikipedia:Today's featured list/{{#time:F j, Y}}", {u'1': u"{{Wikipedia:Today's featured list/{{#time:F j, Y}}}}", u'2': u'{{TFLempty}}'}), (u'#time:F j, Y', {}), (u"Wikipedia:Today's featured list/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u'TFLempty', {}), (u'#ifexist:Template:POTD protected/{{#time:Y-m-d}}', {u'1': u"Today's featured picture ", u'2': u' Featured picture <span style="font-size:85%; font-weight:normal;">(Check back later for today's.)</span>'}), (u'#time:Y-m-d', {}), (u'#ifexist:Template:POTD protected/{{#time:Y-m-d}}', {u'1': u'{{POTD protected/{{#time:Y-m-d}}}}', u'2': u'{{POTD protected/{{#time:Y-m-d|-1 day}}}}'}), (u'#time:Y-m-d', {}), (u'POTD protected/{{#time:Y-m-d}}', {}), (u'#time:Y-m-d', {}), (u'POTD protected/{{#time:Y-m-d|-1 day}}', {}), (u'#time:Y-m-d', {u'1': u'-1 day'}), (u'Other areas of Wikipedia', {}), (u"Wikipedia's sister projects", {}), (u'Wikipedia languages', {}), (u'Main Page interwikis', {}), (u'noexternallanglinks', {})]
Compare this with the output when 'use_mwparserfromhell' is disabled:
$ PYTHONPATH="." python -c "import pywikibot; pywikibot.config.use_mwparserfromhell=False; print repr(pywikibot.extract_templates_and_params(pywikibot.Page(pywikibot.Site('en', 'wikipedia'), 'Main Page').text))"
which produces:
[(u'NUMBEROFARTICLES', {}), (u'Main Page banner', {}), (u'Did you know', {}), (u'In the news', {}), (u'CURRENTDAYNAME', {}), (u'TFLempty', {}), (u'Other areas of Wikipedia', {}), (u"Wikipedia's sister projects", {}), (u'Wikipedia languages', {}), (u'Main Page interwikis', {}), (u'noexternallanglinks', {})]
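To make the difference between the two outputs concrete, here is a minimal sketch of a post-filter that drops the entries a caller would not treat as templates. It is not the pywikibot or mwparserfromhell implementation; the function name drop_non_templates and the two rules (leading '#', leftover braces) are assumptions based only on the output shown above.

```python
def drop_non_templates(results):
    """Keep only (name, params) entries whose name looks like a plain title.

    Hypothetical helper: the two rules below are heuristics inferred from
    the sample output in this report, not an actual library API.
    """
    kept = []
    for name, params in results:
        name = name.strip()
        # Parser functions such as {{#if:...}} or {{#time:...}} start with '#'.
        if name.startswith('#'):
            continue
        # A name that still contains braces was built from a nested
        # transclusion and cannot be used as a title without expanding it.
        if '{{' in name or '}}' in name:
            continue
        kept.append((name, params))
    return kept


# A few entries copied from the mwparserfromhell output above.
sample = [
    ('NUMBEROFARTICLES', {}),
    ('#if:{{Main Page banner}}', {'1': '...'}),
    ('Main Page banner', {}),
    ("PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}", {}),
    ('#time:F j, Y', {}),
    ('Did you know', {}),
]
print(drop_non_templates(sample))
```

This heuristic would still keep magic words such as NUMBEROFARTICLES (as the regex parser does) and would miss a function call like formatnum: when its argument contains no nested braces, so it is only an illustration of the mismatch, not a proposed fix.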
--- Comment #1 from John Mark Vandenberg jayvdb@gmail.com ---
Note that Page.botMayEdit() uses this method via Page.templatesWithParams() to look for {{nobots}}, and needs to catch an exception when it tries to instantiate a Link using these invalid 'template' names.
See comment in
https://git.wikimedia.org/blobdiff/pywikibot%2Fcore.git/7e3772cae04f95cb55b2...
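The kind of guard described in this comment could look roughly like the sketch below. It is self-contained and does not use pywikibot: InvalidTitle, make_link and has_template are hypothetical stand-ins for the real exception, the Link constructor and the {{nobots}} lookup, and the validity rule in make_link is an assumption for illustration only.

```python
class InvalidTitle(Exception):
    """Stand-in for the error raised when a title cannot form a link."""


def make_link(title):
    """Hypothetical stand-in for building a Link from a template name."""
    # Assumed rule: parser functions and names with markup characters
    # are not valid page titles.
    if title.startswith('#') or any(c in title for c in '{}[]<>|'):
        raise InvalidTitle(title)
    return title.strip().replace('_', ' ')


def has_template(results, wanted):
    """Return True if a valid template name in results equals wanted."""
    for name, _params in results:
        try:
            link = make_link(name)
        except InvalidTitle:
            # Not a real template name (e.g. a parser function); skip it
            # instead of letting the exception abort the whole check.
            continue
        if link == wanted:
            return True
    return False


print(has_template([('#time:F j, Y', {}), ('nobots', {})], 'nobots'))
```

Catching the exception per entry means one invalid name does not prevent later, valid names such as nobots from being checked.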
John Mark Vandenberg jayvdb@gmail.com changed:
What     | Removed | Added
----------------------------------------------------------------------------
Keywords |         | upstream
CC       |         | wikipedia.earwig@gmail.com
See Also |         | https://github.com/earwig/mwparserfromhell/issues/10
Summary  | extract_templates_and_params parser bugs loading w:en:Main_Page | extract_templates_and_params parser bugs loading w:en:Main_Page with mwparserfromhell