Kay Hamacher wrote:
What I can't access is the actual (uncompressed, human readable) content of the old_text-column itself. It's compressed, ok. I got that from the php-source of mediawiki. But what's wrong with the following code?
[snip]
($d, $status) = deflateInit(); #-Level => Z_BEST_COMPRESSION);
$status == Z_OK or die "INIT failed\n" ;
($out, $status) = $d->deflate($_[0]) ;
[snip]
Two things: first, deflate() does the compression, so you want to use inflate() to decompress. (The 'de' is confusing, I slip up on that all the time too! Poor naming of the functions...)
Second, you have to match the settings that PHP's gzdeflate() function used to compress them, namely setting the window bits size to -MAX_WBITS. A negative window size selects raw deflate data, with no zlib header and no trailing checksum bytes, which confuses the decompression unless you give it the same setting.
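Putting both points together, here is a minimal sketch of the decompression side, assuming the Compress::Zlib module and that the column really holds raw gzdeflate() output. The helper name inflate_raw is made up for illustration, and the constants are written with explicit parentheses only to keep the parsing unambiguous.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Hypothetical helper: inflate a string produced by PHP's gzdeflate().
sub inflate_raw {
    my ($compressed) = @_;    # private copy; inflate() consumes its input

    # Negative WindowBits means raw deflate: no zlib header, no checksum.
    my ($i, $status) = inflateInit(-WindowBits => -MAX_WBITS());
    $status == Z_OK() or die "inflateInit failed\n";

    my ($out, $rc) = $i->inflate($compressed);

    # Z_STREAM_END is the normal result once the whole stream has been read.
    unless ($rc == Z_OK() || $rc == Z_STREAM_END()) {
        die "inflate failed with status $rc\n";
    }
    return $out;
}
```

Called as inflate_raw($row_value), it should hand back the readable text for a compressed row. Rows that were stored uncompressed would need to be skipped first.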
See this thread for some sample code: http://mail.wikipedia.org/pipermail/wikitech-l/2004-January/007989.html
-- brion vibber (brion @ pobox.com)