Hi,
I noted that MWclient uses, or tries to use, persistent HTTP connections to the API.
I have a few questions, as I am looking at improving Interwicket, which makes lots of API calls, and is sometimes driven nuts by the two layers of proxies (here, on my side of the net) that are prone to failing ...
Is using persistent connections a good idea?
What is the server-side timeout when a connection is idle?
Can I make requests to various different projects (xx.wiktionary, for various xx) on the same connection (looks like it, as they all return the same address)?
Just thinking about what would be best if I re-write a layer on my side (;-)
best, Robert
2010/6/1 Robert Ullmann rlullmann@gmail.com:
> Hi,
> I noted that MWclient uses, or tries to use, persistent HTTP connections to the API.
> I have a few questions, as I am looking at improving Interwicket, which makes lots of API calls, and is sometimes driven nuts by the two layers of proxies (here, on my side of the net) that are prone to failing ...
> Is using persistent connections a good idea?
For requests made in rapid succession, definitely. For requests that are further apart, maybe.
> What is the server-side timeout when a connection is idle?
No idea. You could find out by trying :)
> Can I make requests to various different projects (xx.wiktionary, for various xx) on the same connection (looks like it, as they all return the same address)?
Yes, they all return the same IP address.
Roan Kattouw (Catrope)
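A minimal sketch of the connection reuse discussed above, in Python 3 (where httplib is named http.client). The class name PersistentApiClient, the make_conn parameter, and the /w/api.php path are assumptions for illustration, not mwclient's or Interwicket's actual code. It keeps one connection open, sets the Host header per request so different xx.wiktionary projects can share it, and reconnects once if the connection has dropped (for example after an idle timeout or a proxy failure).

```python
import http.client
from urllib.parse import urlencode


class PersistentApiClient:
    """Reuse one HTTP connection for many API calls; reconnect once on failure."""

    def __init__(self, connect_host, make_conn=http.client.HTTPConnection):
        self.connect_host = connect_host   # host we actually open the socket to
        self.make_conn = make_conn         # hypothetical hook, lets tests swap the connection
        self.conn = None

    def call(self, wiki_host, params, path="/w/api.php"):
        body = urlencode(params)
        headers = {
            "Host": wiki_host,  # an explicit Host header replaces http.client's default
            "Content-Type": "application/x-www-form-urlencoded",
            "Connection": "keep-alive",
        }
        for attempt in range(2):
            if self.conn is None:
                self.conn = self.make_conn(self.connect_host)
            try:
                self.conn.request("POST", path, body, headers)
                return self.conn.getresponse().read()
            except (http.client.HTTPException, OSError):
                # Connection went stale or broke: drop it and retry once.
                self.conn.close()
                self.conn = None
                if attempt == 1:
                    raise
```

Note that the automatic retry is only safe for read requests; if an edit request fails after it was sent, the edit may or may not have gone through, so a real client would probably want to check before resending. Over HTTPS the certificate and SNI would also have to match every project requested on the shared connection.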
Hi,
Thanks, I've coded it (very easily, somewhat less code than there was, "urllib2" just makes things more complicated than the httplib I/F ;-)
Does the API put *really* want the text in the encoded URL? It does work ... but isn't it sometimes kinda long?
Robert
Mediawiki-api mailing list Mediawiki-api@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
2010/6/1 Robert Ullmann rlullmann@gmail.com:
> Hi,
> Thanks, I've coded it (very easily, somewhat less code than there was, "urllib2" just makes things more complicated than the httplib I/F ;-)
> Does the API put *really* want the text in the encoded URL? It does work ... but isn't it sometimes kinda long?
You can use POST requests for all API modules.
Roan Kattouw (Catrope)
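To illustrate the difference, here is a small sketch that puts the parameters in a form-encoded POST body instead of the query string. The helper name build_api_post is hypothetical; it only builds the request object and sends nothing.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_api_post(api_url, params):
    """Return a Request whose parameters travel in the POST body, not the URL."""
    # urlencode percent-encodes non-ASCII text as UTF-8, so the result is pure ASCII.
    body = urlencode(params).encode("ascii")
    return Request(
        api_url,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_api_post(
    "http://en.wiktionary.org/w/api.php",
    {"action": "query", "format": "xml", "titles": "User:Interwicket/FL status"},
)
print(req.get_method(), req.full_url)
```

With this shape the URL stays short no matter how long the page text is, since the text is carried in the body.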
Hi,
That's nice. Would be even nicer if the documentation mentioned that somewhere (;-)
What it does say is "The API takes its input through parameters in the query string." Without a hint of using posted "form" data. Not even in the section on editing, which gives an example with the text in the query string.
I thought it was pretty odd, but hey, that's what it *says* !
I did find a very serious bug. At present my network connection is troublesome, often returning 0 data, or truncated data. (This makes it a good test environment ;-). Take a look at this edit:
http://en.wiktionary.org/w/index.php?title=User%3AInterwicket%2FFL_status&am...
(and preceding and following if you like) The text was truncated and the page updated anyway. The server isn't noticing that the read data doesn't add up to the content-length. (And there isn't any particular terminator, like </api> when reading things.) I think this is a serious bug?
I've protected against it by encoding the token *after* the text, title, and edit summary. Presumably no token means it won't edit?
with my best regards, Robert
On Wed, Jun 2, 2010 at 2:29 PM, Robert Ullmann rlullmann@gmail.com wrote:
> http://en.wiktionary.org/w/index.php?title=User%3AInterwicket%2FFL_status&am...
> (and preceding and following if you like) The text was truncated and the page updated anyway. The server isn't noticing that the read data doesn't add up to the content-length. (And there isn't any particular terminator, like </api> when reading things.) I think this is a serious bug?
> I've protected against it by encoding the token *after* the text, title, and edit summary. Presumably no token means it won't edit?
Yes, that is the proper way to protect against broken networks. It should indeed be noted on the manual page.
Bryan
On 10-06-02 10:58 AM, Bryan Tong Minh wrote:
> Yes, that is the proper way to protect against broken networks.
Isn't sending a hash of the text supposed to be the preferred way of detecting corruption?
-Mike
On Wed, Jun 02, 2010 at 03:29:25PM +0300, Robert Ullmann wrote:
> I've protected against it by encoding the token *after* the text, title, and edit summary. Presumably no token means it won't edit?
I believe the standard HTML edit form places the hidden fields at the end for the same reason. You could also place the "action" at the end, so if truncated the API request will fail for having no action specified.
With the API, you could also use the "md5" option to action=edit (this can also help catch certain charset encoding issues). Or, theoretically, using multipart/form-data instead of application/x-www-form-urlencoded for your POST would cause your post to be malformed if truncated, but I wouldn't be extremely surprised if PHP "fixes" that.
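The two protections mentioned here (token at the end, plus the "md5" option) might be combined roughly as follows. This is a sketch under assumptions: build_edit_body is a hypothetical helper, and the field order is simply one that keeps the token last.

```python
import hashlib
from urllib.parse import urlencode


def build_edit_body(title, text, summary, token):
    """Form-encode an action=edit request with an md5 check and the token last."""
    fields = [
        ("action", "edit"),
        ("format", "xml"),
        ("title", title),
        ("summary", summary),
        ("text", text),
        # MD5 of the UTF-8 text; the server can reject the edit if it does not match.
        ("md5", hashlib.md5(text.encode("utf-8")).hexdigest()),
        # Token goes last, so a body cut off anywhere earlier arrives without it.
        ("token", token),
    ]
    # A list of pairs (rather than a dict) makes the field order explicit.
    return urlencode(fields)


print(build_edit_body("User:Interwicket/FL status", "hello", "update", "abc+\\"))
```

If the body is cut off before the token, the request carries no token at all; if it is cut off inside the text, the md5 value that follows is missing as well, so either way the server has grounds to refuse the edit.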
On Wed, Jun 2, 2010 at 4:27 PM, Brad Jorsch b-jorsch@northwestern.edu wrote:
> Or, theoretically, using multipart/form-data instead of application/x-www-form-urlencoded for your POST would cause your post to be malformed if truncated, but I wouldn't be extremely surprised if PHP "fixes" that.
It does.
I do, in fact, get badtoken errors occasionally, when the network request gets mangled somewhere.
I also got this in my log:
(mwapi putedit error: u'<?xml version="1.0"?><api><error code="unknownerror" info="Unknown error: ``231''" /></api>')
(that is Python repr() format of the returned text)
I've only noted this once; there was no particular difference in the request being made (sorry, I don't have that; I *think* the request was to te.wikt); same code path as many thousands of other page writes.
I'm not really concerned about it, but it might bear looking at?
Robert
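For logging such replies in a more structured way, something like the following could pull the error code and info out of an XML response of the shape quoted above. The function name api_error is hypothetical, and the sketch assumes format=xml with the <error> element directly under <api>.

```python
import xml.etree.ElementTree as ET


def api_error(response_text):
    """Return (code, info) if the API reply contains an <error>, else None."""
    # A truncated reply is not well-formed XML, so fromstring raises ET.ParseError,
    # which doubles as a check that the whole response arrived.
    root = ET.fromstring(response_text)
    err = root.find("error")
    if err is None:
        return None
    return err.get("code"), err.get("info")


logged = ('<?xml version="1.0"?><api><error code="unknownerror" '
          'info="Unknown error: ``231\'\'" /></api>')
print(api_error(logged))
```

Catching ET.ParseError around the call gives a cheap way to treat a cut-off response as a failed request rather than as data.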
On Tue, Jun 1, 2010 at 5:24 PM, Robert Ullmann rlullmann@gmail.com wrote:
> Does the API put *really* want the text in the encoded URL? It does work ... but isn't it sometimes kinda long?
For that reason mwclient always makes post requests in application/x-www-form-urlencoded :)
Bryan
mediawiki-api@lists.wikimedia.org