[Pywikipedia-l] Encoding in HTML source
Bináris
wikiposta at gmail.com
Mon Mar 7 15:26:03 UTC 2011
I'm sending this back to the list:
2011/3/7 Andre Engels <andreengels at gmail.com>
> I don't know about that, but I think you can work the other way
> around, using a bit of regular expression magic:
>
> import re
> ...
> existing = [wikipedia.Page(wikipedia.getSite(), pname).title()
>             for pname in re.findall(r"title=(.*?)&action=edit", fullsourcetext)]
>
> def exists(page):
>     return page.title() in existing
>
This works fine! I didn't know that titles could be passed to Page() in encoded form.
There are already some regexes in my code. Thank you!
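
For anyone who wants to try the regex part without the framework, here is a minimal standalone sketch. It is not what Page() does internally; it only assumes that the titles in the HTML source are percent-encoded UTF-8 with underscores for spaces, and that the ampersand may show up as &amp;. The function name extract_titles and the sample link are made up for illustration, and it uses Python 3's urllib.parse.

```python
import re
from urllib.parse import unquote


def extract_titles(fullsourcetext):
    """Collect decoded page titles from edit links in raw HTML source.

    Hypothetical helper: assumes percent-encoded UTF-8 titles and
    underscores standing in for spaces.
    """
    # Accept both a bare "&" and the HTML-escaped "&amp;" before action=edit.
    raw = re.findall(r"title=(.*?)&(?:amp;)?action=edit", fullsourcetext)
    # Percent-decode each title and turn underscores into spaces.
    return {unquote(name).replace("_", " ") for name in raw}


# Made-up sample link, only to show the shape of the input.
sample = '<a href="/w/index.php?title=Bin%C3%A1ris_teszt&amp;action=edit">edit</a>'
existing = extract_titles(sample)


def exists(title):
    # Set membership test, same idea as the list lookup in the quoted code.
    return title in existing
```

A set is used instead of a list so the membership test stays cheap when the source contains many links.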
--
Bináris