Reading the Wikipedia HTML output, I have found that EditPage.php produces "+\" as the value for wpEditToken. This token is supposedly random, to stop spammers from filling Wikipedia with viagra links. But it doesn't seem very random to me: on all the computers I have tested, it is always "+\".
Is that a code bug, or maybe a misconfiguration by the Wikipedia guys?
Regards.
Tei wrote:
> Reading the Wikipedia HTML output, I have found that EditPage.php produces "+\" as the value for wpEditToken. This token is supposedly random, to stop spammers from filling Wikipedia with viagra links. But it doesn't seem very random to me: on all the computers I have tested, it is always "+\".
> Is that a code bug, or maybe a misconfiguration by the Wikipedia guys?
> Regards.
It is always the same for anonymous users. It does change for logged-in users, which prevents a third-party site from making edits by redirecting a logged-in user. Additionally, bots mishandling those characters will be unable to post.
There was recently another thread about $wpEditToken.
On Thu, Sep 25, 2008 at 4:39 AM, Tei oscar.vives@gmail.com wrote:
> Reading the Wikipedia HTML output, I have found that EditPage.php produces "+\" as the value for wpEditToken. This token is supposedly random, to stop spammers from filling Wikipedia with viagra links. But it doesn't seem very random to me: on all the computers I have tested, it is always "+\".
> Is that a code bug, or maybe a misconfiguration by the Wikipedia guys?
My recollection is that it was a way to detect edits that were passing through certain broken proxies, which would silently corrupt the edit form data. By adding some content to the edit token that these proxies would corrupt as well, the edits would be rejected, while others would be unaffected. Apparently "+\" will trigger this particular bug in these particular proxies, so it will prevent randomly screwing up pages in some cases. The source code/revision log should have more info.
On Fri, Sep 26, 2008 at 1:50 AM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
> On Thu, Sep 25, 2008 at 4:39 AM, Tei oscar.vives@gmail.com wrote:
> > Reading the Wikipedia HTML output, I have found that EditPage.php produces "+\" as the value for wpEditToken. This token is supposedly random, to stop spammers from filling Wikipedia with viagra links. But it doesn't seem very random to me: on all the computers I have tested, it is always "+\".
> > Is that a code bug, or maybe a misconfiguration by the Wikipedia guys?
> My recollection is that it was a way to detect edits that were passing through certain broken proxies, which would silently corrupt the edit form data. By adding some content to the edit token that these proxies would corrupt as well, the edits would be rejected, while others would be unaffected. Apparently "+\" will trigger this particular bug in these particular proxies, so it will prevent randomly screwing up pages in some cases. The source code/revision log should have more info.
So... what stops a malicious banner script from inserting viagra links into random Wikipedia articles?
Other than two Unix timestamps and the md5 of the summary, I don't see how this is protected at all.
On Fri, Sep 26, 2008 at 12:04 PM, Tei oscar.vives@gmail.com wrote:
> So... what stops a malicious banner script from inserting viagra links into random Wikipedia articles?
Nothing except the external link filter, the captcha, and a lot of editors ready to revert them.
> Other than two Unix timestamps and the md5 of the summary, I don't see how this is protected at all.
For anon users, the edit token exists to ensure integrity of the submission, i.e., that it was submitted correctly and as intended. For logged-in users, it also makes impersonation more difficult. It is not meant to prevent incorrect submissions, which is a much higher-level job.
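To make the anonymous versus logged-in distinction concrete, here is a minimal Python sketch. It assumes that a logged-in user's token is the MD5 hex digest of a per-session secret plus a salt, followed by the constant suffix, and that anonymous users get only the suffix. The names `make_edit_token`, `check_edit_token` and `session_secret` are made up for illustration; this is not MediaWiki's actual code.

```python
import hashlib
import hmac
from typing import Optional

# The constant suffix discussed in this thread: a plus sign and a backslash.
EDIT_TOKEN_SUFFIX = "+\\"


def make_edit_token(session_secret: Optional[str], salt: str = "") -> str:
    """Return the constant suffix for anonymous users (no session secret),
    or a per-session digest followed by the suffix for logged-in users."""
    if session_secret is None:
        return EDIT_TOKEN_SUFFIX
    digest = hashlib.md5((session_secret + salt).encode("utf-8")).hexdigest()
    return digest + EDIT_TOKEN_SUFFIX


def check_edit_token(submitted: str, session_secret: Optional[str], salt: str = "") -> bool:
    """Accept the submission only if the token arrived exactly as issued."""
    expected = make_edit_token(session_secret, salt)
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))
```

Under these assumptions, a third-party page knows the anonymous constant but cannot guess the digest tied to someone else's session, so a forged submission on behalf of a logged-in user is rejected.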
On Fri, Sep 26, 2008 at 6:09 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
> On Fri, Sep 26, 2008 at 12:04 PM, Tei oscar.vives@gmail.com wrote:
> > So... what stops a malicious banner script from inserting viagra links into random Wikipedia articles?
> Nothing except the external link filter, the captcha, and a lot of editors ready to revert them.
Is the captcha shown if the external link check fails? Hmm... could the external link be emulated with Ajax?
//Say Hello Mom!
ajax.setRequestHeader("Referer", "http://en.wikipedia.org/w/index.php?title=Effect&action=edit");
On Fri, Sep 26, 2008 at 12:43 PM, Tei oscar.vives@gmail.com wrote:
> Is the captcha shown if the external link check fails?
As far as I recall, the captcha is triggered if any new external link is added. But some links are blacklisted and can never be added regardless.
> Hmm... could the external link be emulated with Ajax?
> //Say Hello Mom!
> ajax.setRequestHeader("Referer", "http://en.wikipedia.org/w/index.php?title=Effect&action=edit");
Ajax doesn't work across domains. If a rogue sysop were to add malicious JS to MediaWiki:Common.js, then sure, you could probably do fun stuff like forge edits somehow.
Tei wrote:
> So... what stops a malicious banner script from inserting viagra links into random Wikipedia articles?
> Other than two Unix timestamps and the md5 of the summary, I don't see how this is protected at all.
It doesn't stop it (and the md5 is not needed). An external page could produce a form to make its users post data to Wikipedia. But a) the target pages would be protected, the content blocked... b) the users may discover it, and c) it won't work with logged-in users.
Aryeh Gregor wrote:
> On Thu, Sep 25, 2008 at 4:39 AM, Tei oscar.vives@gmail.com wrote:
> > Reading the Wikipedia HTML output, I have found that EditPage.php produces "+\" as the value for wpEditToken. This token is supposedly random, to stop spammers from filling Wikipedia with viagra links. But it doesn't seem very random to me: on all the computers I have tested, it is always "+\".
> > Is that a code bug, or maybe a misconfiguration by the Wikipedia guys?
> My recollection is that it was a way to detect edits that were passing through certain broken proxies, which would silently corrupt the edit form data. By adding some content to the edit token that these proxies would corrupt as well, the edits would be rejected, while others would be unaffected. Apparently "+\" will trigger this particular bug in these particular proxies, so it will prevent randomly screwing up pages in some cases. The source code/revision log should have more info.
Yes, it's a kluge I added sometime last year, I think. The problem is that there's a huge install base of broken "PHP proxy" scripts that essentially pass all content through the PHP addslashes() function (or, rather, they have magic_quotes_gpc enabled and don't use stripslashes()).
Trying to edit through such a proxy would, in particular, turn all apostrophes in the page text into "\'" or even "\\\'". We used to get such edits with some regularity. Having a backslash in the edit token prevents editing via such proxies, since they will mangle it in the same way. As a nice bonus, it also happens to prevent some widespread spam- and vandalbots from using those proxies to hide their trails.
(The reason I didn't include an actual apostrophe in the edit token was that, at least at the time, in some parts of the code the edit token was being embedded in hardcoded HTML without proper escaping, at least in some cases with single quotes around it. I didn't feel like tracking down and fixing all cases of that at the time (though it certainly should be done if someone hasn't already), so I just used a backslash, which breaks in the same way but has no special meaning in HTML.)
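The mangling described above can be reproduced with a small Python sketch. The `addslashes` function below is only an approximation of PHP's addslashes() (it escapes single quotes, double quotes, backslashes and NUL bytes), and `broken_proxy` and the sample form values are hypothetical.

```python
def addslashes(value: str) -> str:
    """Rough Python approximation of PHP's addslashes()."""
    out = []
    for ch in value:
        if ch in ("'", '"', "\\"):
            out.append("\\" + ch)   # prefix the character with a backslash
        elif ch == "\0":
            out.append("\\0")       # NUL byte becomes backslash + "0"
        else:
            out.append(ch)
    return "".join(out)


def broken_proxy(form: dict) -> dict:
    """Hypothetical proxy that escapes every field and never unescapes it."""
    return {name: addslashes(value) for name, value in form.items()}


EXPECTED_TOKEN = "+\\"  # anonymous token: plus sign and backslash

# Sample submission: page text with an apostrophe, plus the token.
form = {"wpTextbox1": "Don't panic", "wpEditToken": EXPECTED_TOKEN}
mangled = broken_proxy(form)

# The page text and the token are corrupted together, so comparing the
# token is enough to notice that the text cannot be trusted either.
edit_accepted = mangled["wpEditToken"] == EXPECTED_TOKEN
```

The point of the design is that a backslash and an apostrophe are damaged by the same escaping step, so the backslash can stand in for the apostrophe without needing HTML escaping.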
I'm not sure why the "+" is there; I think it protects against another type of broken proxy that turns plus signs into spaces, presumably due to improper URL-decoding of form values.
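One plausible way for a plus sign to become a space is sketched below in Python. It assumes the proxy decodes the form value and forwards it without re-encoding it, so that the server decodes it a second time. The function names are hypothetical, and the real proxies may well have failed differently.

```python
from urllib.parse import quote, unquote_plus

EXPECTED_TOKEN = "+\\"  # plus sign and backslash


def via_correct_client(value: str) -> str:
    """Browser percent-encodes the value once; server decodes it once."""
    wire = quote(value, safe="")       # "+" is sent as %2B
    return unquote_plus(wire)


def via_broken_proxy(value: str) -> str:
    """Proxy decodes the value, forwards it raw, and the server decodes again."""
    wire = quote(value, safe="")
    decoded_by_proxy = unquote_plus(wire)   # proxy now holds a literal "+"
    # Forwarded without re-encoding: form decoding treats "+" as a space.
    return unquote_plus(decoded_by_proxy)
```

With a correct client the token round-trips unchanged; through the double-decoding proxy the plus sign arrives as a space, the token no longer matches, and the edit is refused.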
It's indeed a hack, but it does work remarkably well for all its simplicity. It'd probably deserve to be documented better, though, so that others won't have to ask the same question as you did.
Ilmari Karonen wrote:
> I'm not sure why the "+" is there; I think it protects against another type of broken proxy that turns plus signs into spaces, presumably due to improper URL-decoding of form values.
It was added by Tim Starling ('Added + to EDIT_TOKEN_SUFFIX on report of broken proxy from mutante') http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/User.php?r1=...
Your addition of \ to the edit token: http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/User.php?vie...
Ilmari Karonen wrote:
> It's indeed a hack, but it does work remarkably well for all its simplicity. It'd probably deserve to be documented better, though, so that others won't have to ask the same question as you did.
The entire edit token system, including the +\ suffix, is documented at MW.org [1].
Roan Kattouw (Catrope)
On Sat, Sep 27, 2008 at 01:37, Ilmari Karonen nospam@vyznev.net wrote:
> Trying to edit through such a proxy would, in particular, turn all apostrophes in the page text into "\'" or even "\\\'". We used to get such edits with some regularity. Having a backslash in the edit token prevents editing via such proxies, since they will mangle it in the same way. As a nice bonus, it also happens to prevent some widespread spam- and vandalbots from using those proxies to hide their trails.
Could there also be some Unicode, such as a long dash (—; replaced with "-" by some proxies) and some characters from IPA (say, ɾ̃ɚ)? I sometimes see users who corrupt such characters, and finding and reverting them is a pain.
— Kalan
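A rough Python sketch of what such a Unicode check might look like. Nothing like this exists in the edit token as discussed in this thread; the canary string, the two simulated proxies and the function names are all hypothetical.

```python
# Hypothetical canary: a long dash followed by the IPA sample from the message
# (U+2014, U+027E, U+0303, U+025A).
CANARY = "\u2014\u027e\u0303\u025a"


def dash_flattening_proxy(text: str) -> str:
    """Simulated client that replaces the long dash with an ASCII hyphen."""
    return text.replace("\u2014", "-")


def latin1_proxy(text: str) -> str:
    """Simulated client that can only represent Latin-1 characters."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


def canary_survives(proxy) -> bool:
    """True if the canary comes back from the proxy exactly as it was sent."""
    return proxy(CANARY) == CANARY
```

A client that passes text through untouched keeps the canary intact, while either simulated proxy alters it, which is the signal the server would use to reject the edit.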
Kalan wrote:
> On Sat, Sep 27, 2008 at 01:37, Ilmari Karonen nospam@vyznev.net wrote:
> > Trying to edit through such a proxy would, in particular, turn all apostrophes in the page text into "\'" or even "\\\'". We used to get such edits with some regularity. Having a backslash in the edit token prevents editing via such proxies, since they will mangle it in the same way. As a nice bonus, it also happens to prevent some widespread spam- and vandalbots from using those proxies to hide their trails.
> Could there also be some Unicode, such as a long dash (—; replaced with "-" by some proxies) and some characters from IPA (say, ɾ̃ɚ)?
You mean in the edit token? There currently aren't any; it's just a hexadecimal number (0-9 and a-f) with +\ at the end.
> I sometimes see users who corrupt such characters, and finding and reverting them is a pain.
I'm personally wary of abusing the edit token too much for this purpose. \ and + are there because clients mangling those characters screw up big time, whereas Unicode/IPA damage is fairly limited (unless you're editing in a language that uses Unicode characters extensively, such as German).
Roan Kattouw (Catrope)
On Wed, Oct 8, 2008 at 1:31 PM, Roan Kattouw roan.kattouw@home.nl wrote:
> I'm personally wary of abusing the edit token too much for this purpose. \ and + are there because clients mangling those characters screw up big time, whereas Unicode/IPA damage is fairly limited (unless you're editing in a language that uses Unicode characters extensively, such as German).
. . . or any other language than English. It's true that we needn't block such people if they aren't going to screw anything up, but in practice that's only likely for English, and most popular pages on enwiki probably contain at least one Unicode character too.
Aryeh Gregor wrote:
> On Wed, Oct 8, 2008 at 1:31 PM, Roan Kattouw roan.kattouw@home.nl wrote:
> > I'm personally wary of abusing the edit token too much for this purpose. \ and + are there because clients mangling those characters screw up big time, whereas Unicode/IPA damage is fairly limited (unless you're editing in a language that uses Unicode characters extensively, such as German).
> . . . or any other language than English.
... or any English-language page that has interlanguage links... :)
-- brion
On Wed, Oct 8, 2008 at 6:44 PM, Kalan kalan.001@gmail.com wrote:
> On Sat, Sep 27, 2008 at 01:37, Ilmari Karonen nospam@vyznev.net wrote:
> > Trying to edit through such a proxy would, in particular, turn all apostrophes in the page text into "\'" or even "\\\'". We used to get such edits with some regularity. Having a backslash in the edit token prevents editing via such proxies, since they will mangle it in the same way. As a nice bonus, it also happens to prevent some widespread spam- and vandalbots from using those proxies to hide their trails.
> Could there also be some Unicode, such as a long dash (—; replaced with "-" by some proxies) and some characters from IPA (say, ɾ̃ɚ)? I sometimes see users who corrupt such characters, and finding and reverting them is a pain.
Ouch! I suggest contacting the authors of that broken proxy software and reporting this.
Even if MediaWiki is shielded against those problems, others will suffer from this.
-- End of message.
On Thu, Oct 9, 2008 at 4:01 AM, Tei oscar.vives@gmail.com wrote:
> Ouch! I suggest contacting the authors of that broken proxy software and reporting this.
> Even if MediaWiki is shielded against those problems, others will suffer from this.
If editing Wikipedia doesn't work, particularly if it fails with an informative error message instead of "edit token didn't match", maybe the users will report it themselves.
wikitech-l@lists.wikimedia.org