Hi Alex
Thanks Daniel, API stuff is a little hard for me: the more I study, the less I edit. :-)
Just to try it out, I fetched the same page both ways: the "render" action gives a file of ~3.4 kB, the "api" action a file of ~5.6 kB.
That's because the render call returns just the HTML, while the API call includes some meta-info in the XML wrapper.
Obviously I'm thinking of bot downloads. Are you suggesting that it would be a good idea to use an *unlogged* bot to avoid page parsing, and to fetch the page code from some cache?
No. I'm saying that non-logged-in views of full pages cause the least server load. I'm not saying that this is what you should use. For one thing, it wastes bandwidth and causes additional work on your side (stripping the skin cruft).
I would recommend using action=render if you need just the plain old HTML, or the API if you need a bit more control, e.g. over whether templates are resolved or not, how redirects are handled, etc.
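To make the two options concrete, here is a minimal Python sketch that only builds the request URLs and does not fetch anything. The endpoint paths (/w/index.php and /w/api.php on en.wikipedia.org) and the helper names are assumptions for illustration, and using the API's "parse" module with prop=text is just one plausible way to get the HTML through the API; adjust to whatever your wiki and use case need.

```python
from urllib.parse import urlencode

# Assumed endpoints for illustration; substitute your wiki's script paths.
INDEX_PHP = "https://en.wikipedia.org/w/index.php"
API_PHP = "https://en.wikipedia.org/w/api.php"


def render_url(title: str) -> str:
    """URL that returns only the rendered HTML of the page body."""
    return INDEX_PHP + "?" + urlencode({"title": title, "action": "render"})


def api_parse_url(title: str, follow_redirects: bool = True) -> str:
    """URL for the API's parse module, which wraps the HTML in metadata."""
    params = {"action": "parse", "page": title, "prop": "text", "format": "xml"}
    if follow_redirects:
        # Ask the API to resolve redirects instead of returning the redirect page.
        params["redirects"] = "1"
    return API_PHP + "?" + urlencode(params)


if __name__ == "__main__":
    print(render_url("Main Page"))
    print(api_parse_url("Main Page"))
```

The redirect switch is the kind of extra control the API gives you; action=render has no equivalent knob in this sketch.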
Whether your bot is logged in when fetching the pages would only matter if you requested the full page HTML, which, as I said, isn't the best option for what you are doing. So, logged in or not, it doesn't matter. But do use a distinctive and descriptive User-Agent string for your bot, ideally containing some contact info (see http://meta.wikimedia.org/wiki/User-Agent_policy).
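As a rough illustration of the User-Agent advice, the following Python sketch prepares a request with an identifying header using only the standard library. The bot name, version, URL, and e-mail address are hypothetical placeholders, not a prescribed format; check the policy page for what is actually expected.

```python
from urllib.request import Request

# Hypothetical bot name, version, and contact details; replace with your own.
USER_AGENT = "ExampleFetchBot/0.1 (https://example.org/bot; bot-owner@example.org)"


def build_request(url: str) -> Request:
    """Prepare a request that identifies the bot via its User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})
```

The returned object can be passed to urllib.request.urlopen() when you actually want to fetch the page.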
Note that as soon as the bot does any editing, it really should be logged in, and, depending on the wiki's rules, have a bot flag, or have some specific info on its user page.
I know that some thousands of calls are nothing for wiki servers, but... I always try to get a good performance, even from the most banal template.
That's always a good idea :)
-- daniel