Nicolas Dumazet wrote:
Log Message:
Fixing the regex according to the change of HTML
I also don't understand the following change:
- regexp = re.compile('<li[^>]*>(?P<date>.+?)\s+<a href=.*?>(?P<user>.+?)</a>\s+(.+?</a>).*?<a href=".*?"(?P<new> class="new")? title=".*?"\s*>(?P<image>.+?)</a>(?:.*?<span class="comment">(?P<comment>.*?)</span>)?', re.UNICODE)
+ regexp = re.compile(r'(?:<li[^>]*>|<div class="mw-log-entry"[^>]*>)(?P<date>.+?)\s+<a href=.*?>(?P<user>.+?)</a>\s+(.+?</a>).*?<a href=".*?"(?P<new> class="new")? title=".*?"\s*>(?P<image>.+?)</a>(?:.*?<span class="comment">(?P<comment>.*?)</span>)?', re.UNICODE)
because I can't find "mw-log-entry" either in the MediaWiki source or in the live output (http://commons.wikimedia.org/w/index.php?title=Special%3ALog&type=upload...). As far as I can see, the previous regexp still works on the Italian and English Wikipedias and on Commons. I must be missing something. Please let me know what it is.
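One way to see exactly what the change affects is to run both patterns against the two kinds of markup. The sketch below is only an illustration: the HTML fragments are simplified, made-up examples (not real Special:Log output), and the names TAIL, old_regexp, new_regexp, ENTRY, li_html and div_html are hypothetical. Both patterns are written as raw strings here, which does not change what they match.

```python
import re

# Part of the pattern shared by the old and new versions (copied from the diff).
TAIL = (r'(?P<date>.+?)\s+<a href=.*?>(?P<user>.+?)</a>\s+(.+?</a>).*?'
        r'<a href=".*?"(?P<new> class="new")? title=".*?"\s*>(?P<image>.+?)</a>'
        r'(?:.*?<span class="comment">(?P<comment>.*?)</span>)?')

# Old pattern: an entry must start with <li ...>.
old_regexp = re.compile(r'<li[^>]*>' + TAIL, re.UNICODE)
# New pattern: an entry may start with <li ...> or <div class="mw-log-entry" ...>.
new_regexp = re.compile(r'(?:<li[^>]*>|<div class="mw-log-entry"[^>]*>)' + TAIL,
                        re.UNICODE)

# Hypothetical, simplified log entry body (an assumption, not real wiki HTML).
ENTRY = ('12:34, 5 May 2009 '
         '<a href="/wiki/User:Example" title="User:Example">Example</a> '
         '(<a href="/wiki/User_talk:Example" title="User talk:Example">Talk</a>) '
         'uploaded "<a href="/wiki/File:Foo.jpg" title="File:Foo.jpg">File:Foo.jpg</a>" '
         '<span class="comment">(test upload)</span>')

li_html = '<li>' + ENTRY + '</li>'
div_html = '<div class="mw-log-entry">' + ENTRY + '</div>'

for name, html in (('li', li_html), ('div', div_html)):
    old_match = old_regexp.search(html)
    new_match = new_regexp.search(html)
    print(name, 'old matches:', old_match is not None,
          'new matches:', new_match is not None)
    if new_match:
        print('  user:', new_match.group('user'),
              'image:', new_match.group('image'))
```

If the wikis in question only ever emit `<li>` entries, both patterns behave the same on them, so the added alternative would matter only where the `<div class="mw-log-entry">` form actually appears.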