On Sat, Oct 25, 2008 at 10:05 PM, Brion Vibber
<brion(a)wikimedia.org> wrote:
If you screen-scrape an HTML *user interface
form* for your bot, YOU
WILL GET BROKEN BY CHANGES, especially if you don't bother to even TRY
to behave like an actual client (which would load all the required form
variables from the edit page in the first place).
Indeed. If you are using screenscraping, you should at least bother to
run a HTML parser over the edit page. You need it anyway to get the
content and other tokens. See
<http://mwclient.svn.sourceforge.net/viewvc/mwclient/trunk/mwclient/page_nowriteapi.py?revision=45&view=markup>
for an example.
Good regexes save you the (memory) effort of an HTML parser. I
used
some really long one to give me all the fields necessary, now I use
API for edits.
Marco