Hi everyone. I'm writing a Wikipedia bot. The bot will look at past
revisions of a page to check whether any of them contained vandalism.
I can get the unmodified wikitext of a page using:
http://en.wikipedia.org/w/index.php?action=raw&title=User:Richardcavell
But that only works for the most recent revision. Using prop=revisions
I can get the revid of a past revision. Then if I try:
http://en.wikipedia.org/w/index.php?action=raw&revids=342742660
It gives me the main page, not the past revision of User:Richardcavell.
Doing:
http://en.wikipedia.org/w/index.php?action=raw&revids=342742660&tit…
doesn't work. It still gives me the main page. If I do:
http://en.wikipedia.org/w/api.php?action=query&format=xml&prop=revi…
then I get the past revision. It is surrounded by XML tags, and some
of the text is changed. For example, " is replaced by &quot;.
Removing the XML tags isn't so difficult, but is prone to error. The
second issue is more annoying.
1. Is it possible for me to get the raw wikitext of a past revision,
without XML tags?
2. Is it possible for me to get the raw wikitext without the " -> &quot;
style modifications?
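To make both questions concrete: as far as I can tell, the changed characters are ordinary XML entity escaping, so a real XML parser (rather than stripping tags by hand) should undo the tags and the escaping in one step. Below is a minimal sketch in Python using only the standard library. The sample response string is made up for illustration and only loosely shaped like the API's XML output, and the function name extract_wikitext is my own invention, not part of any API.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample that only approximates the shape of an
# api.php XML response; a real response carries more attributes and elements.
SAMPLE_RESPONSE = (
    '<api><query><pages>'
    '<page pageid="1" ns="2" title="User:Example">'
    '<revisions>'
    '<rev revid="342742660">He said &quot;hello&quot; &amp; left.</rev>'
    '</revisions>'
    '</page>'
    '</pages></query></api>'
)


def extract_wikitext(xml_response):
    """Return the text of the first <rev> element, with entities decoded."""
    root = ET.fromstring(xml_response)
    rev = root.find(".//rev")  # first <rev> anywhere below the root
    if rev is None:
        raise ValueError("no <rev> element found in response")
    # The parser has already turned &quot; and &amp; back into " and &.
    return rev.text or ""


print(extract_wikitext(SAMPLE_RESPONSE))
```

This only covers the unescaping side. It does not answer whether raw text can be fetched directly for an old revision.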
Richard