Hi,
I hope this is allowed here, but I have recently written a Perl module to interface with the MediaWiki API, and am looking for users to try it out and let me know what works and what doesn't.
The module is not yet on CPAN (awaiting an account), but can be downloaded from http://www.exotica.org.uk/wiki/User:BuZz/MediaWiki::API
Thanks for listening!
Best Regards
Jools
On Tue, Jun 24, 2008 at 01:05:19PM +0100, Jools Smyth wrote:
> I hope this is allowed here, but I have recently written a Perl module to interface with the MediaWiki API, and am looking for users to try it out and let me know what works and what doesn't.
> The module is not yet on CPAN (awaiting an account), but can be downloaded from http://www.exotica.org.uk/wiki/User:BuZz/MediaWiki::API
Ironically, I also have a Perl module named Mediawiki::API at
svn co svn://hemlock.ts.wikimedia.org/cbm/subversion/mediawiki-api
Mine is somewhat more complete in terms of queries, but I don't have editing support written yet.
Perhaps we can work together to make a single library. I have been planning to work on rewriting mine (it was originally proof-of-concept code, and significant reorganization is needed). My library is field-tested in VeblenBot and the WP 1.0 bot on enwiki.
I'll be glad to dual-license my code in both GPL and the Perl license.
I noticed your library is not very robust about handling errors from the API. I have found through experience that the following error situations have to be expected and handled when dealing with the API:
0. The server may be temporarily unavailable, or the DNS may temporarily fail to resolve. This is true for any web-based service. The query has to be repeated if the server is temporarily unavailable.
1. The squids may return an error message, which is marked by an HTTP 'x-squid-error' header. This means the query must be repeated.
2. You will quite routinely get a 0 byte response with a successful HTTP status code. This means the query must be repeated.
3. You need to use a maxlag= parameter, and detect when the maxlag is exceeded. If it is exceeded, the query must be repeated after the appropriate delay.
4. Rarely, you may get a truncated XML response, which causes the XML parser to throw an exception. This means the query must be repeated.
You can see this error handling in the 'makeHTMLrequest' and 'makeXMLrequest' functions in my API.pm; the former is poorly named and should really be 'makeHTTPrequest'.
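To make that concrete, here is a stripped-down sketch of the kind of retry loop I mean, using LWP::UserAgent and XML::Simple. The sub name, retry count and back-off are invented for this example and this is not the actual code in API.pm:

use strict;
use warnings;
use LWP::UserAgent;
use XML::Simple;

sub fetch_api_xml {
    my ( $url, $params, $max_retries ) = @_;
    $max_retries ||= 5;
    my $ua = LWP::UserAgent->new( agent => 'ExampleBot/0.1' );

    for my $attempt ( 1 .. $max_retries ) {
        # (3) maxlag= asks the server to refuse the request while replication lag is high
        my $response = $ua->post( $url, { %$params, format => 'xml', maxlag => 5 } );

        my $content = $response->decoded_content;

        # (0) server/DNS trouble, (1) squid error header, (2) empty 200 response:
        # in all three cases, wait and try again
        if (   !$response->is_success
            || defined $response->header('x-squid-error')
            || !defined $content
            || !length $content )
        {
            sleep 5 * $attempt;
            next;
        }

        # (4) a truncated response makes the XML parser die; trap it and retry
        my $data = eval { XMLin($content) };
        unless ( defined $data ) {
            sleep 5 * $attempt;
            next;
        }

        # (3) again: maxlag exceeded comes back as an API <error> element
        if ( ref $data eq 'HASH' && $data->{error} && $data->{error}{code} eq 'maxlag' ) {
            sleep 5 * $attempt;
            next;
        }

        return $data;    # parsed response as a hash reference
    }
    die "API request failed after $max_retries attempts\n";
}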
- Carl
> Mine is somewhat more complete in terms of queries, but I don't have editing support written yet.
Also, I think we are tackling it in slightly different ways: you have higher-level helper functions for the various API calls, which I don't have; mine expects the client code to do that itself (though I do provide a generic "list" helper function, similar to your fetchWithContinuation). I am constantly in two minds about how much to implement in the module itself, as I don't want a small API change to mean the module needs updating (I would prefer that the client code is changed instead).
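To illustrate what I mean, the continuation handling in that helper boils down to something like this. This is a simplified sketch rather than the actual code: $api is assumed to be an object whose api() method sends a request and returns the decoded response, and the sub name and the $extract callback are invented for this example:

use strict;
use warnings;

sub fetch_all {
    # $extract is a code reference that pulls the wanted items out of one response
    my ( $api, $params, $extract ) = @_;
    my @results;
    my %query = %$params;

    while (1) {
        my $data = $api->api( \%query )
            or die "API request failed\n";
        push @results, $extract->($data);

        # keep requesting while the server returns a query-continue element
        last unless $data->{'query-continue'};
        my ($module) = keys %{ $data->{'query-continue'} };
        %query = ( %$params, %{ $data->{'query-continue'}{$module} } );
    }
    return \@results;
}

Whether the module should know how to extract the items itself, or leave that to the caller as above, is exactly the kind of trade-off I keep going back and forth on.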
> Perhaps we can work together to make a single library. I have been planning to work on rewriting mine (it was originally proof-of-concept code, and significant reorganization is needed). My library is field-tested in VeblenBot and the WP 1.0 bot on enwiki.
Sure, sounds like a good idea. Our bot was written to take over the handling of http://www.exotica.org.uk/wiki/UnExoticA, which currently uses an older Perl module. I've almost switched it over now to use my new module.
> I'll be glad to dual-license my code in both GPL and the Perl license.
Actually, I was going to make mine GPL; I just haven't edited the license file etc., so I'm happy to go GPL as well.
> The server may be temporarily unavailable, or the DNS may temporarily fail to resolve. This is true for any web-based service. The query has to be repeated if the server is temporarily unavailable.
I don't have any such handling, and I guess I should. That's down to me writing the module primarily to replace some code we were using before, which we run on our own wiki (and on the server machine itself).
> The squids may return an error message, which is marked by an HTTP 'x-squid-error' header. This means the query must be repeated.
> You will quite routinely get a 0 byte response with a successful HTTP status code. This means the query must be repeated.
> You need to use a maxlag= parameter, and detect when the maxlag is exceeded. If it is exceeded, the query must be repeated after the appropriate delay.
> Rarely, you may get a truncated XML response, which causes the XML parser to throw an exception. This means the query must be repeated.
I assume these are all connected with Wikipedia, as I never came across them on our installation. I will look at implementing this, though. I guess a config option with a retry count might be an idea.
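Something along these lines, perhaps (the option names are hypothetical, just to show the sort of thing I have in mind):

my $mw = MediaWiki::API->new( {
    api_url     => 'http://example.com/w/api.php',
    max_retries => 5,     # give up after this many failed attempts
    retry_delay => 10,    # seconds to wait between attempts
} );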
> You can see this error handling in the 'makeHTMLrequest' and 'makeXMLrequest' functions in my API.pm; the former is poorly named and should really be 'makeHTTPrequest'.
Thanks. Will take a close look later (since it is 5:47 am now and I really should not be awake).
Best Regards
Jools