Hi everyone. I'm trying to code a Wikipedia bot. The bot is going to
look at past revisions of a page to check whether vandalism was present
in past revisions. I can get the unmodified wikitext of a page using:
http://en.wikipedia.org/w/index.php?action=raw&title=User:Richardcavell
But that only works for the most recent revision. Using prop=revisions
I can get the revid of a past revision. Then if I try:
http://en.wikipedia.org/w/index.php?action=raw&revids=342742660
It gives me the main page, not the past revision of User:Richardcavell.
Doing:
http://en.wikipedia.org/w/index.php?action=raw&revids=342742660&titles=User…
doesn't work. It still gives me the main page. If I do:
http://en.wikipedia.org/w/api.php?action=query&format=xml&prop=revisions&rv…
then I get the past revision. It is surrounded by XML tags, and some
of the text is changed. For example, " is replaced by &quot;.
Removing the XML tags isn't so difficult, but is prone to error. The
second issue is more annoying.
1. Is it possible for me to get the raw wikitext of a past revision,
without XML tags?
2. Is it possible for me to get the raw wikitext without the " -> &quot;
style modifications?
Richard
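
A minimal sketch of two ways to approach the questions above, written under assumptions rather than as a confirmed answer. It assumes index.php selects a past revision through the oldid parameter (not revids), and that the api.php XML output carries the wikitext as the text of a <rev> element. The helper names raw_revision_url and wikitext_from_api_xml are hypothetical. Parsing the XML with a real parser, instead of stripping tags by hand, also turns &quot; back into ".

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

INDEX = "http://en.wikipedia.org/w/index.php"


def raw_revision_url(revid):
    """Build an index.php URL for the raw wikitext of one revision.

    Assumption: action=raw honours the oldid parameter.
    """
    return INDEX + "?" + urlencode({"action": "raw", "oldid": revid})


def wikitext_from_api_xml(xml_text):
    """Extract the wikitext from an api.php XML reply.

    Assumption: the text sits inside the first <rev> element.
    The XML parser decodes entities such as &quot; on its own.
    """
    root = ET.fromstring(xml_text)
    rev = root.find(".//rev")
    return rev.text if rev is not None else None
```

For example, raw_revision_url(342742660) builds a URL ending in action=raw&oldid=342742660, with no title needed.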
Hi, everyone. I'm coding a Wikipedia bot. The MediaWiki API
documentation isn't helping me much. I'd like to know:
I want to download the most recent version of a page that was written
by an author other than the author of the most recent version (i.e.,
what it looked like before the current author edited it). I want to compare
versions, so that if vandalism has occurred in the most recent author's
version but not the previous author's version, my bot can rollback.
Presently my bot downloads the wikitext of the page using index.php
rather than api.php. How do I download the wikitext of the previous
version?
Richard
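
One possible way to pick the revision described above, as a sketch only. It assumes the bot has already fetched the page history with prop=revisions (for instance with rvprop=ids|user) and turned it into a list of dicts ordered newest first. The keys "revid" and "user" and the function name are illustrative assumptions.

```python
def last_revision_by_other_author(revisions):
    """Return the newest revision not written by the latest author.

    `revisions` is assumed to be ordered newest first, each item a
    dict with at least a "user" key. Returns None when every
    revision in the list is by the same author.
    """
    if not revisions:
        return None
    current_author = revisions[0]["user"]
    for rev in revisions[1:]:
        if rev["user"] != current_author:
            return rev
    return None
```

If the function returns None, the bot would need to request more history before deciding anything.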
Hi everyone.
I'm trying to code a Wikipedia bot, building the framework myself. I
want to understand how to interpret the results of a request to the
Edit MediaWiki API. So far my code does this:
Interpret HTTP status code.
    if it's 503, sleep then try again
    if it's 504, sleep then try again
    if it's not 200, abort giving an error code
Test for an API Error
    if it's maxlag, sleep then try again
    if it's editconflict, restart the whole test-edit sequence
    if it's success, return success
abort, giving an error code
1. Are there any common replies that I am not accounting for with the
above code?
2. How do I best test for an API Error? Should I search the header
for "MediaWiki-API-Error:" and use the rest of the string as the code?
Or should I test the body? I'm working in XML.
3. Similarly, how do I best test for success?
TIA,
Richard
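
The sequence above can be sketched as one function that inspects the response body, under assumptions. It assumes an XML reply whose root element contains either an <error code="..."> child or an <edit result="..."> child, and that success is signalled by result="Success". The function name and the four return values are hypothetical.

```python
import xml.etree.ElementTree as ET

RETRY, RESTART, SUCCESS, ABORT = "retry", "restart", "success", "abort"


def classify_edit_response(status, body):
    """Map an HTTP status and XML body onto the next action."""
    # Transient server-side failures: sleep, then try again.
    if status in (503, 504):
        return RETRY
    if status != 200:
        return ABORT
    root = ET.fromstring(body)
    # Assumption: API errors appear as <error code="..."/> under the root.
    error = root.find("error")
    if error is not None:
        code = error.get("code")
        if code == "maxlag":
            return RETRY
        if code == "editconflict":
            return RESTART
        return ABORT
    # Assumption: a successful edit is reported as <edit result="Success"/>.
    edit = root.find("edit")
    if edit is not None and edit.get("result") == "Success":
        return SUCCESS
    return ABORT
```

Testing the body rather than a header keeps all of the checks in one parsed document, at the cost of parsing every reply.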
Hi everyone. Looking at this page:
http://www.mediawiki.org/wiki/API:Edit
There are many parameters that can be set when sending an HTTP POST
with the intention of doing an edit. Do all of the parameters go in
the URL? Can I shift some of them into the HTTP POST in the body of
the request?
In particular, where does the new text go? Does that go as a form item
in the body of the request?
Richard
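
As a sketch of one arrangement, every parameter, including the new text, can be sent as a form field in the body of the POST, leaving the URL as the bare api.php address. This assumes the API accepts application/x-www-form-urlencoded bodies. The field names are taken from the API:Edit page, and build_edit_request is a hypothetical helper.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

API = "http://en.wikipedia.org/w/api.php"


def build_edit_request(title, text, token, summary=""):
    """Build (but do not send) a POST request for action=edit."""
    fields = {
        "action": "edit",
        "format": "xml",
        "title": title,
        "text": text,        # the new page text travels in the body
        "summary": summary,
        "token": token,      # urlencode escapes characters such as + and \
    }
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Supplying data makes urllib issue a POST rather than a GET.
    return Request(API, data=body, headers=headers)
```

Passing the returned object to urllib.request.urlopen would send it. Keeping the text out of the URL avoids URL length limits on large pages.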
Hi all.
I'm coding a Wikipedia bot framework (not using any existing framework)
for use on en-wiki. I'm a little confused by the documentation of edit
tokens. Which is the correct approach:
1. My bot obtains an edit token using the Main page as a nominal
title, saves the edit token and then uses it in every subsequent
attempt to edit.
or
2. My bot queries each page to obtain an edit token before attempting
to edit.
Richard (TIA)
Hello,
How can I fetch images from Wikimedia Commons? I'm able to get a list of all images in an article through this query:
http://en.wikipedia.org/w/api.php?action=parse&page=Norway&prop=images&form…
What query should I use to fetch thumbnails and full-size images for the listed image titles?
Regards,
Siteshwar Vashisht
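
One possible query for this, sketched under assumptions: prop=imageinfo with iiprop=url is expected to return the full-size URL of each file, and adding iiurlwidth is expected to return a thumbnail URL scaled to that width. The helper name imageinfo_url is hypothetical, and the titles are assumed to already carry the "File:" prefix, which the caller may need to add to the names from the image list.

```python
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"


def imageinfo_url(file_titles, thumb_width=200):
    """Build a query URL asking for full-size and thumbnail URLs.

    `file_titles` is a list such as ["File:Example.jpg"]; several
    titles are joined with "|" into one request.
    """
    params = {
        "action": "query",
        "format": "xml",
        "prop": "imageinfo",
        "iiprop": "url",
        "iiurlwidth": thumb_width,  # assumed to trigger thumbnail URLs
        "titles": "|".join(file_titles),
    }
    return API + "?" + urlencode(params)
```

The image files themselves would then be downloaded from the URLs found in the reply.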