Our site uses the ConfirmAccount extension, which allows registration requests to include a CV (supplying one is a prerequisite for becoming an editor on our site; normal users are registered as authors). I have identified a problem: the response to the download request for the CV contains a Content-Encoding header value of ", gzip". This befuddles some browsers, such as Firefox, IE and Safari, and they fail to decompress the file.
My guess is that this is an Apache configuration problem or an Apache bug, but on the off-chance my guess is wrong and it has something to do with MW configuration, I thought I would bring the issue up here. If this is not an MW problem, then confirmation of that will narrow down the search space for a solution.
Here are the request and response headers (I have obscured some file and cookie information using ellipses to ensure privacy; the output is from HTTPFox running on Firefox):
(Request-Line)   GET /wiki?title=Special:ConfirmAccounts/editors&file=to...v.pdf HTTP/1.1
Host             en.citizendium.org
User-Agent       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept           text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language  en-us,en;q=0.5
Accept-Encoding  gzip,deflate
Accept-Charset   ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive       115
Connection       keep-alive
Referer          http://en.citizendium.org/wiki?title=Special:ConfirmAccounts/editors&acrid=...
Cookie           __utma=184949757....

(Status-Line)        HTTP/1.0 200 OK
Date                 Wed, 09 Mar 2011 01:27:46 GMT
Server               Apache/2.2.3 (CentOS)
X-Powered-By         PHP/5.3.5
Expires              Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control        no-cache, no-store, max-age=0, must-revalidate
Pragma               no-cache
Content-Disposition  inline;filename*=utf-8'en'to...v.pdf
Last-Modified        Tue, 08 Feb 2011 05:25:51 GMT
Content-Encoding     , gzip
Vary                 Accept-Encoding
Content-Length       48636
Content-Type         application/pdf
X-Cache              MISS from aristotle.citizendium.org
X-Cache-Lookup       MISS from aristotle.citizendium.org:80
Via                  1.0 aristotle.citizendium.org:80 (squid/2.6.STABLE21)
Connection           keep-alive
Note that the Accept-Encoding header in the request specifies "gzip,deflate", while the Content-Encoding value in the response is ", gzip". I posted a query about the problem on http://groups.google.com/group/comp.infosystems.www.servers.unix over two weeks ago, but have not received a response.
This problem occurs on both our live wiki, which runs MediaWiki 1.13.2, and our test wiki, which runs 1.16.2. The Apache version is 2.2.3.
On Thu, 10 Mar 2011 01:29:15 +0100, Platonides wrote:
Looks very odd.
Do you get the same Content-Encoding when performing the request through the squid? And when it is sent directly to the apache?
It was a good question. I hadn't tried to bypass squid. However, when I do, I get the same result.
On Mar 15, 2011 3:25 PM, "Platonides" Platonides@gmail.com wrote:
Ok, I have been able to reproduce it. This comes from a combination of mod_deflate, ob_gzhandler and MediaWiki.
When serving files, MediaWiki clears any gzipping layer, including its own. You seem to have output_handler=ob_gzhandler set in php.ini. When MediaWiki detects that ob_gzhandler is active, it calls ob_end_clean() and then header( 'Content-Encoding:' ); in order to clear the Content-Encoding field (otherwise you would get plain data with a header saying it is gzipped). Then you also have mod_deflate in the mix. It detects an existing Content-Encoding header, and apr_table_mergen "merges" the values by appending ', gzip', even though the existing header is empty.
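To make the merge step concrete, here is a minimal Python sketch of the behaviour described above. It is not the Apache source: the function names and the plain dict standing in for the header table are hypothetical, and it only assumes that the merge appends ", " plus the new value whenever the key already exists.

```python
def table_merge(headers, name, value):
    """Append ', value' whenever the key already exists, even if it is empty.

    Hypothetical stand-in for the merge behaviour described in this thread.
    """
    if name in headers:
        headers[name] = headers[name] + ", " + value
    else:
        headers[name] = value
    return headers


def table_merge_skip_empty(headers, name, value):
    """Variant that treats an empty existing value as if the header were absent."""
    existing = headers.get(name, "").strip()
    headers[name] = existing + ", " + value if existing else value
    return headers


# PHP has emitted an empty Content-Encoding header via header( 'Content-Encoding:' ).
php_headers = {"Content-Encoding": ""}

broken = table_merge(dict(php_headers), "Content-Encoding", "gzip")
fixed = table_merge_skip_empty(dict(php_headers), "Content-Encoding", "gzip")

print(repr(broken["Content-Encoding"]))  # ', gzip'
print(repr(fixed["Content-Encoding"]))   # 'gzip'
```

The first result matches the ", gzip" value seen in the response headers; the second shows what a merge that ignores empty values would produce.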
Where is the bug? mod_deflate shouldn't concatenate if the field is empty. PHP could skip passing Content-Encoding to other modules when it is empty. MediaWiki could use header( 'Content-Encoding: identity' ); instead of header( 'Content-Encoding:' );
How can _you_ fix it right now? You don't need three compression layers. I'd deactivate both mod_deflate and output_handler=ob_gzhandler, letting MediaWiki compress the pages automatically for you. Disabling just one of them (mod_deflate or output_handler=ob_gzhandler) would work too, but note that keeping mod_deflate with your current configuration will compress streamed files, which is likely to be inefficient.
RFC 2616 section 14.11 defines the Content-Encoding header as
"Content-Encoding" ":" 1#content-coding
The #rule (see section 2) requires at least one content-coding to be present, which MediaWiki is currently violating (yes, the empty header does arrive at the user's browser).
I have explicitly set Content-Encoding to identity in r84060, using header_remove if available.
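As an illustration of the 1#content-coding rule, the following Python sketch checks a Content-Encoding value. The function name is hypothetical and the token pattern is my reading of the RFC 2616 token definition, so treat it as an approximation rather than a reference parser. It relies on the RFC 2616 section 2.1 statement that null list elements are allowed but do not count towards the minimum.

```python
import re

# Approximation of the RFC 2616 "token" rule: any CHAR except CTLs and separators.
TOKEN = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$")


def parse_content_codings(value):
    """Return the content-codings in a header value, or raise ValueError.

    Null elements (empty items between commas) are dropped and do not
    count towards the "at least one" requirement of the 1# rule.
    """
    elements = [part.strip() for part in value.split(",")]
    codings = [element for element in elements if element]
    if not codings:
        raise ValueError("Content-Encoding needs at least one content-coding")
    for coding in codings:
        if not TOKEN.match(coding):
            raise ValueError("not a valid token: %r" % coding)
    return codings


print(parse_content_codings("gzip"))    # ['gzip']
print(parse_content_codings(", gzip"))  # ['gzip']

try:
    parse_content_codings("")
except ValueError as error:
    print("rejected:", error)
```

Under this reading the empty header is what breaks the grammar, while ", gzip" still parses as a single coding because the leading null element is ignored. The browsers mentioned at the start of the thread evidently do not handle that form, so this only shows what the grammar allows, not what clients accept.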
Opened it as bug 28069 https://bugzilla.wikimedia.org/show_bug.cgi?id=28069
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
On Wed, 16 Mar 2011 10:01:59 -0700, Brion Vibber wrote:
Iirc we originally used Content-encoding: identity but found this broke some clients. Probably worth pulling out the old bugs to check if the issue is still present.
-- brion
On Fri, Mar 18, 2011 at 11:18 AM, Dan Nessett dnessett@yahoo.com wrote:
I searched for bugs with "Content-Encoding identity" in either the summary or comments. None were returned. I then searched for only "Content-Encoding" and only the current 28069 was returned. So, perhaps there are other search criteria that will give us a list of the old bugs.
Looks like I filed it in Mozilla's bug tracker during the Firefox 2 development days; since the next version of the biggest cross-platform browser didn't support it, I never actually committed the version using 'identity' to our SVN.
https://bugzilla.mozilla.org/show_bug.cgi?id=341944
Apparently the bug's been resolved somewhere between 2006 and 2010 (my test URL still pumps out the header, and it loads in Firefox 4 RC1), but I haven't done a thorough browser survey to see whether any release versions of Firefox 2 or 3 broke, or any other browsers or proxies or client apps would be similarly affected by it.
-- brion
mediawiki-l@lists.wikimedia.org