I've just had a colleague send me links to a couple of English Wikipedia articles that were displaying as complete garbage - it looked like corrupt character encoding or something (there was no UI - just a page full of random characters and boxes). Running ?action=purge on them sorted it out, but if he hit upon two corrupted pages in a few minutes, there are probably more.
Does anyone know anything about it?
On Wed, 2013-02-20 at 12:08 +0000, Thomas Dalton wrote:
I've just had a colleague send me links to a couple of English Wikipedia articles that were displaying as complete garbage - it looked like corrupt character encoding or something (there was no UI - just a page full of random characters and boxes). Running ?action=purge on them sorted it out, but if he hit upon two corrupted pages in a few minutes, there are probably more.
Does anyone know anything about it?
Not without a testcase (URL) to start investigating. :)
andre
On 20 February 2013 12:11, Andre Klapper aklapper@wikimedia.org wrote:
On Wed, 2013-02-20 at 12:08 +0000, Thomas Dalton wrote:
I've just had a colleague send me links to a couple of English Wikipedia articles that were displaying as complete garbage - it looked like corrupt character encoding or something (there was no UI - just a page full of random characters and boxes). Running ?action=purge on them sorted it out, but if he hit upon two corrupted pages in a few minutes, there are probably more.
Does anyone know anything about it?
Not without a testcase (URL) to start investigating. :)
I've fixed the ones I know about, so I don't know if they'll be much help (which is why I didn't specify them before). If it does help, they were:
http://en.wikipedia.org/wiki/Neil_Clark_Warren and http://en.wikipedia.org/wiki/Pepper_Schwartz
(You can draw your own conclusions from my colleague's office reading habits!)
On 20/02/13 23:30, Thomas Dalton wrote:
On 20 February 2013 12:11, Andre Klapper aklapper@wikimedia.org wrote:
On Wed, 2013-02-20 at 12:08 +0000, Thomas Dalton wrote:
I've just had a colleague send me links to a couple of English Wikipedia articles that were displaying as complete garbage - it looked like corrupt character encoding or something (there was no UI - just a page full of random characters and boxes). Running ?action=purge on them sorted it out, but if he hit upon two corrupted pages in a few minutes, there are probably more.
Does anyone know anything about it?
Not without a testcase (URL) to start investigating. :)
I've fixed the ones I know about, so I don't know if they'll be much help (which is why I didn't specify them before). If it does help, they were:
http://en.wikipedia.org/wiki/Neil_Clark_Warren and http://en.wikipedia.org/wiki/Pepper_Schwartz
(You can draw your own conclusions from my colleague's office reading habits!)
It's not a test case after you've run action=purge on it. If you want to report things like this, it's best if you don't run action=purge, or even report it to anyone who might be inclined to do such a thing. Cache-related test cases are very fragile, so it takes some care to get them to a developer intact.
In the past, there have been problems with gzipped output being served without a Content-Encoding header, due to subtle Squid vary header bugs. But it's hard to tell if that's what happened here, just from your description.
-- Tim Starling
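For anyone who hits a page like this and still has the raw response, the failure mode Tim mentions (a gzipped body sent without a Content-Encoding header) can be checked mechanically. This is only a minimal sketch: the function name and the sample headers are made up for illustration, and it assumes you already have the response headers and the undecoded body bytes in hand.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_unlabelled_gzip(headers, body):
    """Return True if the body starts with the gzip magic bytes
    but no Content-Encoding header declares gzip.

    `headers` is a dict of header name -> value; `body` is the raw bytes.
    """
    encoding = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "content-encoding":
            encoding = value.lower()
    return body.startswith(GZIP_MAGIC) and "gzip" not in encoding


# Hypothetical responses built locally, not real Wikipedia output.
page = gzip.compress(b"<html>hello</html>")
print(looks_like_unlabelled_gzip({"Content-Type": "text/html"}, page))
print(looks_like_unlabelled_gzip({"Content-Encoding": "gzip"}, page))
```

A True result only says the symptom matches this pattern; it does not say which cache layer produced it.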
On 20 February 2013 12:43, Tim Starling tstarling@wikimedia.org wrote:
It's not a test case after you've run action=purge on it.
Which is why I didn't bother including the URLs in the initial report.
If you want to report things like this, it's best if you don't run action=purge, or even report it to anyone who might be inclined to do such a thing. Cache-related test cases are very fragile, so it takes some care to get them to a developer intact.
My top priority was helping the person that reported it to read the page they wanted to read.
A little gratitude to someone trying to help you fix a problem wouldn't go amiss...
Brian, would you take a look at https://www.mediawiki.org/wiki/How_to_report_a_bug and maybe update it to clarify what sorts of information to try to hold on to for debugging purposes?
On 02/20/2013 07:50 AM, Thomas Dalton wrote:
On 20 February 2013 12:43, Tim Starling tstarling@wikimedia.org wrote:
It's not a test case after you've run action=purge on it.
Which is why I didn't bother including the URLs in the initial report.
If you want to report things like this, it's best if you don't run action=purge, or even report it to anyone who might be inclined to do such a thing. Cache-related test cases are very fragile, so it takes some care to get them to a developer intact.
My top priority was helping the person that reported it to read the page they wanted to read.
A little gratitude to someone trying to help you fix a problem wouldn't go amiss...
Thomas, thanks for the bug report. Sorry for the mixed messages here. If you run across the problem again and report it to us before helping your colleague, you can tell him I told you to do it, and blame me!
And I have now learned something about two very different relationship experts, thanks to those URLs. :-)
On Wed, Feb 20, 2013 at 9:59 AM, Sumana Harihareswara sumanah@wikimedia.org wrote:
Brian, would you take a look at https://www.mediawiki.org/wiki/How_to_report_a_bug and maybe update it to clarify what sorts of information to try to hold on to for debugging purposes?
What sort of debugging information is useful depends on the situation. In most cases the type of information I mentioned would be overkill.
A little gratitude to someone trying to help you fix a problem wouldn't go amiss...
We appreciate the bug report; we just can't do anything about it without more information. To give a (not entirely fair) comparison, imagine someone posted on your talk page that there was a spelling error on Wikipedia. I assume you would respond to such a report with "where?" It wouldn't be because you're ungrateful that you respond like that, but simply because you cannot fix the issue without more information (Wikipedia is a big place). The situation here is somewhat similar. We're grateful for the report, but would need more information before we can do anything about it.
--bawolff
On 20 February 2013 16:32, bawolff bawolff+wn@gmail.com wrote:
A little gratitude to someone trying to help you fix a problem wouldn't go amiss...
We appreciate the bug report; we just can't do anything about it without more information. To give a (not entirely fair) comparison, imagine someone posted on your talk page that there was a spelling error on Wikipedia. I assume you would respond to such a report with "where?" It wouldn't be because you're ungrateful that you respond like that, but simply because you cannot fix the issue without more information (Wikipedia is a big place). The situation here is somewhat similar. We're grateful for the report, but would need more information before we can do anything about it.
My actual question was "Does anyone know anything about it?" - I was trying to determine if this was a known problem, which would help determine the next step. I think I supplied enough information for that purpose.
Sumana Harihareswara wrote:
Brian, would you take a look at https://www.mediawiki.org/wiki/How_to_report_a_bug and maybe update it to clarify what sorts of information to try to hold on to for debugging purposes?
More to the point (this is a list for developers, after all), if you're a developer and you find a bug (or a potential bug!), please report it to https://bugzilla.wikimedia.org (aliased as https://bugs.wikimedia.org or https://bugs.mediawiki.org). :-)
This list (wikitech-l) is fine for discussion of bugs (or possible bugs), but it _really helps_ when the issue can be properly tracked in Bugzilla. Mailing lists are ephemeral. Bugzilla is forever.
MZMcBride
On Wed, 2013-02-20 at 08:59 -0500, Sumana Harihareswara wrote:
Brian, would you take a look at https://www.mediawiki.org/wiki/How_to_report_a_bug and maybe update it to clarify what sorts of information to try to hold on to for debugging purposes?
I'd like to keep How_to_report_a_bug a generic, high-level page. Case-specific debug information is very welcome on a separate page though (similar to http://www.mediawiki.org/wiki/Manual:How_to_debug for MediaWiki), and could be linked to from How_to_report_a_bug.
andre
Never on Wikipedia (that I've heard of), but on third-party wikis a misconfiguration that causes double encoding with gzip produces symptoms very similar to what you describe.
But as Andre said, without an example it is very difficult to say anything. If someone comes across something like that again, please include as much detail as possible: save a copy of the broken page, save the HTTP headers you got with the request (if you know how), mention whether you were logged in or not, check whether the page is always served broken or whether it was just a one-time thing, etc. You never know which detail might be important.
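One way to hold on to that evidence before anyone purges the page is to write the raw bytes and the headers to disk. The following is a rough sketch, not an official procedure: the function names and output file names are invented here, and a script fetching the URL is not guaranteed to receive the same cached variant your browser did (it is not logged in and may send different request headers), so saving the page from the browser is still worth doing as well.

```python
import json
import time
import urllib.request
from pathlib import Path


def save_evidence(prefix, url, status, headers, body):
    """Write the raw body and a JSON summary of the response side by side.

    `headers` is a list of (name, value) pairs; `body` is the undecoded bytes.
    The file names derived from `prefix` are arbitrary choices for this sketch.
    """
    body_path = Path(f"{prefix}.body.bin")
    meta_path = Path(f"{prefix}.meta.json")
    body_path.write_bytes(body)  # raw bytes, no decoding or decompression
    meta = {
        "url": url,
        "fetched_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "status": status,
        "headers": headers,  # order preserved, duplicates kept
    }
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return body_path, meta_path


def fetch_and_save(url, prefix):
    """Fetch `url` once, asking for gzip as a browser would, and save the result.

    urllib does not decompress the body, so what is saved is what was sent.
    HTTP error statuses raise urllib.error.HTTPError and are not handled here.
    """
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return save_evidence(
            prefix,
            url,
            response.status,
            list(response.headers.items()),
            response.read(),
        )


# Example call (needs network access, so it is left commented out):
# fetch_and_save("http://en.wikipedia.org/wiki/Pepper_Schwartz", "broken-page")
```

Running it more than once, a few minutes apart, also helps answer whether the page is always served broken or only sometimes.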
-bawolff
On 2013-02-20 8:11 AM, "Andre Klapper" aklapper@wikimedia.org wrote:
On Wed, 2013-02-20 at 12:08 +0000, Thomas Dalton wrote:
I've just had a colleague send me links to a couple of English Wikipedia articles that were displaying as complete garbage - it looked like corrupt character encoding or something (there was no UI - just a page full of random characters and boxes). Running ?action=purge on them sorted it out, but if he hit upon two corrupted pages in a few minutes, there are probably more.
Does anyone know anything about it?
Not without a testcase (URL) to start investigating. :)
andre
Andre Klapper | Wikimedia Bugwrangler http://blogs.gnome.org/aklapper/
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l