Hi,
Is there an effective way to do this? We use only the MediaWiki API in the latest Huggle, and it turns out that when users are logged out of MediaWiki, it still works (edits are made from the IP address instead).
How can I ensure that an API query fails unless the user is logged in? Is there some variable for that? Huggle executes a huge number of API queries in multiple threads, so checking whether the user is logged in before every single query would be too slow.
Look up the assert parameter in the API.
On Jun 19, 2014 6:52 AM, "Petr Bena" benapetr@gmail.com wrote:
How can I ensure that an API query fails unless the user is logged in? Is there some variable for that?
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Nice.
Can this parameter be anywhere in the URL, for example api.php?action=query&assert=user&prop=blabla, or does it need to be in a specific position, like the token?
On Thu, Jun 19, 2014 at 12:54 PM, John phoenixoverride@gmail.com wrote:
Look up the assert parameter in the API
I think appending it to the end of a query is the common practice
On Thu, Jun 19, 2014 at 7:06 AM, Petr Bena benapetr@gmail.com wrote:
Nice.
Can this parameter be anywhere in the URL, for example api.php?action=query&assert=user&prop=blabla, or does it need to be in a specific position, like the token?
Petr Bena wrote:
Can this parameter be anywhere in the URL, for example api.php?action=query&assert=user&prop=blabla, or does it need to be in a specific position, like the token?
It can be anywhere after "api.php", but not anywhere in the URL.
I don't believe any token requires a specific position in a URL.
MZMcBride
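A minimal Python sketch of the point above (not Huggle's actual code): assert=user is just another query parameter, so a client can add it to every request in one place and its position does not matter. The endpoint, the helper names, and the "assertuserfailed" error code are assumptions here and should be checked against the wiki being used.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed endpoint, used only for illustration.
API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def build_api_url(params, endpoint=API_ENDPOINT):
    """Return an API URL that always carries assert=user (hypothetical helper)."""
    query = dict(params)
    query["assert"] = "user"  # position in the query string does not matter
    query.setdefault("format", "json")
    return endpoint + "?" + urlencode(query)


def is_assert_failure(response):
    """True if a decoded JSON response reports a failed assert=user check.

    Assumes the error code is "assertuserfailed".
    """
    return response.get("error", {}).get("code") == "assertuserfailed"


url = build_api_url({"action": "query", "meta": "userinfo"})
print(url)
```

Adding the parameter centrally means each worker thread gets the check for free, with no extra "am I logged in?" round trip before each query.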
On Thu, 19 Jun 2014 15:16:22 +0200, MZMcBride z@mzmcbride.com wrote:
It can be anywhere after "api.php", but not anywhere in the URL. I don't believe any token requires a specific position in a URL.
Yeah… The API documentation mentions this, but I think it's wrong or at least misleading.
https://www.mediawiki.org/wiki/API:Edit#Token
"When passing this to the Edit API, always pass the token parameter last (or at least after the text parameter). That way, if the edit gets interrupted, the token won't be passed and the edit will fail."
I'm reasonably sure that the HTTP and HTTPS protocols are smart enough to recognize "cut off" requests, and that any servers whatsoever are smart enough to implement this behavior.
Blame Reedy [1]. Or ask him for clarification...
--HM
[1] https://www.mediawiki.org/w/index.php?title=API:Edit&diff=410992&old...
On 19 June 2014 14:27, Bartosz Dziewoński matma.rex@gmail.com wrote:
Yeah… The API documentation mentions this, but I think it's wrong or at least misleading.
To some extent it makes sense. You shouldn't rely on servers :P I am placing the token last, per this suggestion.
On Thu, Jun 19, 2014 at 5:42 PM, Happy Melon happy.melon.wiki@gmail.com wrote:
Blame Reedy [1]. Or ask him for clarification...
--HM
[1] https://www.mediawiki.org/w/index.php?title=API:Edit&diff=410992&old...
In general I would extend this: "you should never assume that other programmers did things correctly, because we are lazy morons"
On Thu, Jun 19, 2014 at 5:48 PM, Petr Bena benapetr@gmail.com wrote:
To some extent it makes sense. You shouldn't rely on servers :P I am placing the token last, per this suggestion.
On Thu, Jun 19, 2014 at 6:27 AM, Bartosz Dziewoński matma.rex@gmail.com wrote:
I'm reasonably sure that the HTTP and HTTPS protocols are smart enough to recognize "cut off" requests, and that any servers whatsoever are smart enough to implement this behavior.
Actually not. multipart/form-data POST requests have an end marker, but application/x-www-form-urlencoded requests have not - they use the same param1=foo&param2=bar format GET URLs do, there is no way to tell if that is cut off. Lower-level protocols will deal with issues like lost packets or network disconnection, but if the body of the request is truncated because of an error in the sending HTTP library, like using a buffer that is too small, there is no way the server could detect that.
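To illustrate the urlencoded case, here is a small Python sketch using the standard library parser as a stand-in for whatever the server uses. The field names and values are made up. A body cut off in the middle of the text value still parses cleanly, even in strict mode, so nothing signals that data is missing.

```python
from urllib.parse import urlencode, parse_qs

# A complete urlencoded body, with the token as the last field.
full_body = urlencode({
    "title": "Sandbox",
    "text": "first line second line",
    "token": "abc123",  # placeholder value
})

# Simulate a sender that cut the body off partway through the text value.
cut = full_body.index("+second")
truncated_body = full_body[:cut]

# strict_parsing=True raises ValueError on malformed pairs; none is raised here.
full = parse_qs(full_body, strict_parsing=True)
truncated = parse_qs(truncated_body, strict_parsing=True)

print(full["text"][0])
print(truncated["text"][0])
print("token" in truncated)
```

The truncated body is still a well-formed list of key=value pairs; the only visible differences are a shorter text value and the absent token field.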
Gergo Tisza wrote:
Lower-level protocols will deal with issues like lost packets or network disconnection, but if the body of the request is truncated because of an error in the sending HTTP library, like using a buffer that is too small, there is no way the server could detect that.
Thanks for sharing this. It's interesting to read.
Though I believe the server, or at least MediaWiki's application logic on the server, does indeed provide a means of detecting a truncated parameter value:
https://www.mediawiki.org/w/api.php?action=help&modules=edit
-- md5 - The MD5 hash of the text parameter, or the prependtext and appendtext parameters concatenated. If set, the edit won't be done unless the hash is correct --
I suppose you'd need to make sure the "md5" parameter made it to the Web server before the truncated text parameter... bah. :-)
MZMcBride
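A rough Python sketch of how a client might use the md5 parameter quoted above. It assumes the hash is taken over the UTF-8 bytes of the text and that parameters are sent in dictionary order; the helper names are hypothetical, and md5_matches only imitates the server-side comparison for illustration.

```python
import hashlib


def edit_params(title, text, token):
    """Build edit parameters with md5 ahead of text and the token last."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    # Dicts keep insertion order, so md5 precedes text and token comes last.
    return {
        "action": "edit",
        "assert": "user",
        "title": title,
        "md5": digest,
        "text": text,
        "token": token,
    }


def md5_matches(params):
    """Imitate the check described in the API help: hash of text equals md5."""
    actual = hashlib.md5(params["text"].encode("utf-8")).hexdigest()
    return actual == params["md5"]


params = edit_params("Sandbox", "hello world", "abc123")
print(md5_matches(params))

# If the text arrives truncated, the hash no longer matches.
damaged = dict(params, text="hello")
print(md5_matches(damaged))
```

Computing the hash from the original string, rather than from the encoded request buffer, is what lets it catch a truncation that happens later in the sending code.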
On 6/19/14, Gergo Tisza gtisza@wikimedia.org wrote:
Actually not. multipart/form-data POST requests have an end marker, but application/x-www-form-urlencoded requests have not - they use the same param1=foo&param2=bar format GET URLs do, there is no way to tell if that is cut off.
What about the content-length header? I believe that's included with POST requests even when using application/x-www-form-urlencoded form.
Although I have noticed we do have code in EditPage.php to detect this situation for normal edits, so I guess it must happen on occasion.
--bawolff
On 2014-06-19, 6:23 PM, Brian Wolff wrote:
What about the content-length header? I believe that's included with POST requests even when using application/x-www-form-urlencoded form.
Although I have noticed we do have code in EditPage.php to detect this situation for normal edits, so I guess it must happen on occasion.
--bawolff
Our EditPage situation is because it's possible for a form to be submitted while the HTML is still downloading and some of the form elements aren't in the DOM yet, especially when something like a textarea has a huge amount of wikitext in it.
As for what Bartosz is talking about, I think he means a case where a person or a library builds an HTTP POST request body by urlencoding text and appending it to some form of buffer. If the buffer has a maximum length and truncates text instead of throwing an error on append, the truncated buffer is then what gets sent as the POST. Since it's the buffer that's wrong, the Content-Length would be based on the incorrect buffer length. And since urlencoded form text has no end markers or numbers indicating the length of an individual param, a large body of text could be cut off midway by the end of the buffer and the request would still be valid.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]
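The scenario described above can be sketched in Python as follows. The fixed BUFFER_SIZE and the buggy_send helper are hypothetical and exist only to reproduce the bug; no real HTTP library is being modelled. The Content-Length is computed from the already-truncated buffer, so the request is self-consistent, and the only thing that gives the truncation away is that the token, sent last, never arrives.

```python
from urllib.parse import urlencode, parse_qs

BUFFER_SIZE = 64  # hypothetical fixed-size send buffer


def buggy_send(params):
    """Encode params, then silently truncate to the buffer size (the bug)."""
    body = urlencode(params).encode("ascii")
    sent = body[:BUFFER_SIZE]  # no error is raised when data is dropped
    headers = {"Content-Length": str(len(sent))}  # matches the truncated body
    return headers, sent


params = {
    "action": "edit",
    "title": "Sandbox",
    "text": "x" * 100,
    "token": "abc123",  # placeholder value, deliberately the last field
}

headers, sent = buggy_send(params)
received = parse_qs(sent.decode("ascii"))

print(headers["Content-Length"], len(sent))
print(len(received["text"][0]))
print("token" in received)
```

With the token first instead of last, the same truncation would deliver a valid token together with the shortened text.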
On Thu, Jun 19, 2014 at 9:23 PM, Brian Wolff bawolff@gmail.com wrote:
What about the content-length header? I believe that's included with POST requests even when using application/x-www-form-urlencoded form.
I suggest people in this thread read section 4.4 of RFC 2616.
There is no situation in which an HTTP request can be "cut off" such that part of the query goes missing. Assuming you are using a compliant web server, the Content-Length header is basically required for request bodies (unless you're doing chunked encoding or something).
Try right now submitting a POST request without a Content-Length header to Wikipedia. It results in a 400. Additionally, if you send an incorrect header, it will either wait until timeout if it's too long or give 400 if it's too short.
Like Daniel said, the only situation that is possible is some sort of buffer issue, where there is a client-side application logic error that causes the incorrect request to be sent. But this is not really MediaWiki's problem, and clients should be using the Content-MD5 header (or whatever the bastardized MediaWiki version is, since MW doesn't actually support HTTP) if they really want to ensure data integrity.
-- Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science
On 2014-06-19, 7:55 PM, Tyler Romeo wrote:
Like Daniel said, the only situation that is possible is some sort of buffer issue, where there is a client-side application logic error that causes the incorrect request to be sent. But this is not really MediaWiki's problem, and clients should be using the Content-MD5 header (or whatever the bastardized MediaWiki version is, since MW doesn't actually support HTTP) if they really want to ensure data integrity.
A) Issues like this can cause a simple content tweak to end up erasing the latter part of an article. It may not be a problem "in" MediaWiki, but it is still something that affects us and something for us to document for API users.
B) If we even have anything similar to Content-MD5 (I don't remember seeing anything), it's optional, and the clients most likely to have buffer bugs like that are going to be the ones that don't use it.
C) Even if a client used something like Content-MD5, it wouldn't really help. If a client has a buffer issue like the one we're talking about, there's a good chance that the corruption exists in the same buffer the hash is based on, so the hash itself will be incorrect (computed from the already-truncated text).
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]
On 06/19/2014 11:09 PM, Daniel Friesen wrote:
B) If we even have anything similar to Content-MD5 (I don't remember seeing anything) it's optional and the clients most likely to have buffer bugs like that are going to be the ones that don't use it.
There is an optional md5 parameter in the API; see https://www.mediawiki.org/wiki/API:Edit#Parameters
Matt Flaschen